The investigation stems from a recommendation by Holt that clinicians should
have training which makes it possible for them to validate themselves as 
clinical predictors in much the same way as tests are cross-validated. 3 experi- 
ments were devised to provide feedback concerning accuracy of predictions 
in the expectation that feedback could be used to improve performance. The
“clinicians” studied were undergraduate students, and the prediction task
involved interpretation of short sentence-completion protocols. In all 3 experiments
there was evidence for the superior performance of those Ss who
received feedback, but the bulk of the evidence suggested that the feedback
effect was attributable to enhancement of motivation of the Ss rather than to
specific informational value.


The generally low level of accuracy of 
clinical predictions, whether by experts or 
novices, is by now so well known as not to 
need documentation. However, what remains 
is the important question why accuracy is 
so low, and the consequent of its answer, a 
recommendation for improvement in clinical 
prediction. 

Meehl (1954) exposed the poverty of clini- 
cal psychology in the predictive field by 
showing the generally equal or greater ac- 
curacy of mechanical or actuarial methods. 
However, Holt (1958) has suggested that a 
fundamental deficiency in Meehl’s compari- 
sons was that in all instances rather highly 
refined psychometric devices especially con- 
structed for the purpose for which they were 
being employed were being matched against 
clinicians who had not had the opportunity 
to sharpen their skills by the systematic 
development and cross-validation of them 
against the criterion to be predicted. In ad- 
dition, Holt believes, the clinicians were often 
attempting to predict a criterion with which 
they were not familiar and whose nature they 
did not understand. Therefore, Holt con- 
cluded that the comparisons made by Meehl 
were probably irrelevant to the question 
whether clinicians were necessarily inferior
to actuarial methods. Other writers (e.g.,
Luft, 1950) have also pointed to the necessity
for clinicians to experience feedback
concerning the correctness of their predictions
so that a corrective element may enter their
systems.

1 The writers wish to thank Doris Washington and
Suzanne Flemming for assistance in collection and
analysis of data.

This paper presents the results of three 
investigations concerned with the general ef- 
ficacy of feedback in improving clinical pre- 
diction. In the studies to be reported, “cli- 
nician” populations consisted of naive college 
undergraduates, and the prediction task is 
somewhat artificial but not totally unlike 
many clinical tasks. Since the clinical material 
to be given judges (Js) in the investigations 
below consists of brief incomplete-sentence
protocols, it is relevant to point out that 
Jackson (1962) found that in making judg- 
ments about anxiety from sentence comple- 
tions, naive Js and expert clinicians were in- 
fluenced in about the same degree by the 
same parameters of the response protocols. 

With respect to the effect of feedback on 
predictions, these investigations clearly belong 
in the tradition of the many “knowledge-of-results”
experiments on the basis of which
Ammons (1956) concluded that knowledge 
of results (or feedback) almost universally 
results in a more rapid learning and a higher 
level of performance. In a study by Murray
and Deabler (1958) which is directly relevant
to the potential effect of feedback on
improvement in clinical judgments, 15 psychologists
attempted to “match” figure drawings
with five diagnostic labels. After making
his judgments for a set of drawings, the cli- 
nician was shown the correct diagnoses and 
given time to study the materials and to at- 
tempt to understand his errors. Then predic- 
tions were made for the next set of five 
drawings. Over a series of 20 sets of draw- 
ings, the results indicated clear progress in 
learning. Unfortunately the experiment had 
not been directed specifically at the question 
of feedback, and there was not a control 
group to justify the conclusion that it was 
feedback per se which contributed to im- 
provement in performance. Oskamp (1962) 
did find some evidence for the effectiveness of 
feedback, but in his experiment it was given 
only after every 50 trials, and the nature of 
its effect was obscured. 

The first study to be reported here is con- 
cerned with the two ideas presented by Holt 
(1958). The first is what Holt calls “job 
analysis” by which he means that the cli- 
nician should acquaint himself thoroughly 
with the job to be done, that is, the nature 
of the prediction he is to make. Holt believes 
that clinicians too often try to make predic- 
tions concerning criteria of which the pre- 
dictor has only the vaguest understanding. 
In Experiment I below, some naïve clinicians 
were given definitional information concerning 
the criterion, others were not. 

The second proposal to be investigated is 
what is referred to as “feedback.” Holt indi- 
cates that the predictor must have the oppor- 
tunity to discover what kinds of data afford 
indications of the trait to be predicted. In the 
investigations below, J was told whether each 
prediction he made was correct or incorrect 
immediately following that prediction and 
was given a moment to reflect on the informa- 
tion to determine whether such information 
might enable him to adjust his implicit 
hypotheses when making subsequent predic- 
tions. Experiments II and III are directed 
toward elucidating the specific way in which 
feedback operates in the improvement of per- 
formances. Two alternative, but not mutually 
exclusive hypotheses are (a) that feedback 
operates by providing information by means 
of which the subject can adjust his implicit
hypotheses or (b) that feedback serves a
motivational function by convincing and reminding
the subject that the task is one on
which improvement is expected and possible.


EXPERIMENT I 
Method 


The prediction task. It was necessary to devise 
a prediction task which was simple and which did 
not require excessive time. As an initial step, the 
dimensions of anxiety and pleasantness of personal- 
ity were chosen to be predicted. Preliminary work 
indicated that these dimensions were meaningful to 
our subjects and potentially predictable. Target objects
(Os) were selected from a class of 60 nursing
students for whom a variety of measures were available.
Twelve Os were chosen for each of the two
variables, 6 representing the extreme high and 6 the
extreme low value on each characteristic. Anxiety
was defined in terms of the six highest and six lowest
scoring girls on the Pt + K score of the MMPI.
The mean score for the six anxious Os was 36.
and for the nonanxious Os, 19.8. Pleasantness was
determined by a peer-nomination technique for the
opposite traits of “most pleasant” and “least
pleasant.” The six girls receiving the most nominations
for “most pleasant” were designated
“pleasant” and the six receiving the most nominations
for “least pleasant” were designated “unpleasant.”
For the pleasant Os the mean number of
“most pleasant” nominations was 23.0 and the mean
number of “least pleasant” votes was 1. The corresponding
figures for the unpleasant Os were 1
and 26.6.

The materials given to Js from which they were 
to make their predictions were incomplete-sentence 
protocols taken from the Rotter Incomplete Sen- 
tences Blank (ISB)—College Form (Rotter, 1950). 
It was necessary to have test data which would have 
some meaning for Js and which would not require 
excessive time for study. Obviously, it was also 
hoped that the criterion groups would be potentially 
differentiable by means of their sentence completions. 
Again preliminary study supported the use of the 
procedures chosen, for in a pilot study a small group 
of subjects did predict at a better than chance level 
for the anxiety and pleasantness variables.

The investigations to be reported all assume at 
least a modest validity for the sentence-completion
measure. While we have no specific evidence for the
validity of these brief protocols against the criteria
chosen, a review of the literature justifies some confidence
in the validity of sentence-completion measures
for traits such as anxiety and sociability (Sechrest, 
in press). 

For each O, four 5 × 8 inch data cards were prepared.
Ten of O’s ISB completions were typed on
each card with Card 1 consisting of every fourth 
completion beginning with the first; Card 2 of every 
fourth completion beginning with the second, etc. 
Since there were 4 data cards for each of the 24 Os,
there were 96 data cards which were divided into
four groups of 24 each. The four groups of cards 
were formed first by dividing the cards on the basis 
of the characteristic of O to be predicted. The 
resulting two groups were further divided into two 
more groups by putting Data Cards 1 and 2 to- 
gether to form one group and Data Cards 3 and 4 
together to form another group. Half of the Js made 
their predictions from Data Cards 1 and 2; the 
other half from Cards 3 and 4. Each J, although 
making 24 predictions, actually made 2 separate 
predictions for each of 12 different Os. It should 
be pointed out that Js were not told that Os would 
be repeated during the prediction series. They were 
led to believe that they would be making predictions 
about 24 different Os. 
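
The card construction described above can be illustrated with a short sketch. It assumes 40 completions per O (four cards of ten each); the function names and the use of item numbers as stand-ins for the typed completions are hypothetical and are not part of the original procedure.

```python
# Illustrative sketch only: names and item numbering are assumptions.

def build_cards(completions, n_cards=4):
    """Card k holds every n_cards-th completion, beginning with the k-th."""
    return [completions[start::n_cards] for start in range(n_cards)]


def card_sets(cards):
    """Cards 1 and 2 form one set of materials; Cards 3 and 4 form the other."""
    return cards[:2], cards[2:]


# Item numbers 1..40 stand in for one O's ISB completions.
items = list(range(1, 41))
cards = build_cards(items)
set_a, set_b = card_sets(cards)

print(cards[0])          # item numbers typed on Card 1
print(cards[1])          # item numbers typed on Card 2
print(len(set_a) * 12)   # predictions per J: 2 cards for each of 12 Os
```

Under this reading, a J assigned to Cards 1 and 2 sees each of the 12 Os twice, which gives the 24 predictions mentioned in the text.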

Procedure. The experimental characteristics of the 
groups were established on the basis of the two 
types of predictions to be made and the two inde- 
pendent variables, information and feedback. The 
main experimental groups consisted of Js predicting 
either anxiety or pleasantness. Each group was sub- 
divided into groups of Js receiving (a) information 
plus feedback, (b) information but not feedback, 
(c) no information but feedback, and (d) no infor- 
mation and no feedback. This resulted in eight 
groups of Js.

The information given to the appropriate groups 
of Js consisted of: (a) a definition of anxiety ac- 
companied by six examples from the Pt scale of 
statements which anxious persons might make, and 
six statements which would be endorsed by non- 
anxious persons, or (b) a brief description of social 
pleasantness and an explanation of the voting pro- 
cedure employed to get the criterion data. The 
information was given to J before he made any 
of his predictions. Feedback consisted of telling J 
whether he was right or wrong following each 
prediction.

Upon entering the experimental room, each J was 
seated at a small table, opposite and facing the 
experimenter (E), but separated by a screen pre- 
venting visual observation so the J could see neither 
E nor the materials. The following instructions were
then read, as appropriate, to J: 


This is a study to test the ability of college 
students to predict whether a person is (anxious 
or nonanxious; pleasant or unpleasant) from the 
results of a sentence completion test taken by 
the person. 

You will be given a card on which appear ten 
sentences. The underlined words are those originally
given to the person, and the remainder of
each sentence indicates the way in which the 
person completed the sentence. You will be given 
45 seconds in which to read and study the sen- 
tences, and at the end of this time I will ask 
for your prediction. Please do not give your pre- 
diction until I ask for it. (After you have made 
your prediction, I will tell you whether you are 
“right” or “wrong.” You will then be given an 
additional 15 seconds in which to study the sentences
in the light of this information. You are
to use this information to help you to make more
accurate predictions in the future.) (You will then 
be given an additional 15 seconds in which to 
study the card.) You will then be given another 
card, from which you will make another similar 
prediction. This process will be repeated until 
you have made 24 predictions from 24 different 
sets of sentences. All of the individuals are female 
and are in their first year of nurse’s training at a 
large hospital. 


Following these instructions, Js in the information 
groups were given the appropriate definitions and 
information about the criterion variables. The ap- 
propriate set of cards was then shuffled, by J, and 
placed face down behind the screen. Then each card 
was, in turn, handed to J. The J was given 45 
seconds to study the sentences and then was asked 
for his prediction. Then, whether given feedback or 
not, all Js were given an additional 15 seconds to 
study the card, so that each J spent a total of 60 
seconds on each card. Preliminary study indicated 
that the period was sufficient and near the maximum 
that Js could be kept at the task. 

The Js’ responses were recorded as either correct 
or incorrect. The score was simply the total number 
of correct predictions. 

Judges. The Js were 96 male undergraduates en- 
rolled in introductory psychology. One random order 
of experimental conditions was preestablished, and 
as each J appeared he was placed in the appropriate 
group. A total of 12 Js was run in each of the 
eight experimental groups. 
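
The 2 × 2 × 2 layout of Experiment I and the preestablished random order of conditions can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not say how the random order was generated, so the shuffling method, the seed, and all names below are hypothetical.

```python
import itertools
import random
from collections import Counter

# Factor levels taken from the text; the labels themselves are illustrative.
TRAITS = ("anxiety", "pleasantness")
INFORMATION = ("information", "no information")
FEEDBACK = ("feedback", "no feedback")


def make_schedule(n_per_group=12, seed=0):
    """Return one fixed random order of conditions, one entry per J.

    The seed is an assumption made only so the sketch is reproducible.
    """
    groups = list(itertools.product(TRAITS, INFORMATION, FEEDBACK))
    schedule = groups * n_per_group
    random.Random(seed).shuffle(schedule)
    return schedule


schedule = make_schedule()
counts = Counter(schedule)
print(len(schedule))   # total number of Js
print(len(counts))     # number of experimental groups
```

Each arriving J would simply take the next entry in the schedule, which yields equal group sizes once all entries are used.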


Results 


The means and standard deviations for all 
groups are given in Table 1. The means for 
these three groups indicate a trend in the 
expected direction. Each group receiving feed- 
back has a higher mean score than its cor- 
responding no-feedback group, and each in- 
formation group has a higher mean than its 
corresponding no-information group. Two ap- 
proaches were taken to the statistical analysis 
of the data. An analysis of variance yielded 
a significant effect (F = 4.72, df 1, 88) only 
for the triple interaction of Feedback × Information
× Traits predicted. Inspection of
the means in Table 1 reveals that informa- 
tion seems to be the more important factor 
in predicting anxiety while for pleasantness 
feedback is relatively more important. It will 
also be noted that for both variables feedback 
plus information yields the highest mean 
performance, and no feedback plus no 
information yields the poorest scores. 

TABLE 1

THE MEAN AND STANDARD DEVIATION OF PREDICTIVE
ACCURACY SCORES FOR EACH GROUP

                          Variable predicted
Condition                 Anxiety    Pleasantness
Feedback
  Information
    M                      14.25       13.83
    SD                      1.88        1.70
  No information
    M                      13.42       13.17
    SD                      2.71        1.75
No feedback
  Information
    M                      13.92       13.00
    SD                      2.46        2.45
  No information
    M                      12.75       12.00
    SD                      1.88        1.35

A second analysis was performed by testing
the effects of feedback versus no feedback and
information versus no information against the expectation
of a chance performance, that is, 12
correct predictions out of 24. Table 2 presents
the results of the four chi-square tests
which ensued from the analysis. The results
clearly indicate that the .5 accuracy level
(i.e., chance) is untenable as a hypothesis
for subjects serving under either feedback or
information conditions, but it is quite tenable
for subjects who had either no feedback or
no information.
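
The chi-square values in Table 2 can be checked with a short goodness-of-fit computation. The sketch below uses the observed counts given in the table and treats the parenthesized 24s as expected frequencies under chance, an assumption that is consistent with the reported values; the function name is hypothetical.

```python
def chi_square_gof(observed, expected):
    """Pearson chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# (greater than chance, less than chance) counts as reported in Table 2.
table2 = {
    "Feedback": (36, 12),
    "No feedback": (26, 22),
    "Information": (36, 12),
    "No information": (26, 22),
}
expected = (24, 24)  # chance split of the 48 Js at each level

# Critical value of chi-square with 1 df at the .001 level.
CRITICAL_001 = 10.83

for group, observed in table2.items():
    stat = chi_square_gof(observed, expected)
    print(f"{group:15s} chi-square = {stat:5.2f}  exceeds .001 value: {stat > CRITICAL_001}")
```

For the feedback and information rows each cell departs from expectation by 12, giving 144/24 + 144/24 = 12.00; for the other two rows the departure is 2, giving 4/24 + 4/24, or about .33.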

TABLE 2

CHI-SQUARE ANALYSIS FOR EFFECTS OF EXPERIMENTAL
VARIABLES ON PREDICTIVE ACCURACY ACROSS
TWO TRAITS

                   Greater than       Less than
Group              chance accuracy    chance accuracy    χ²
Feedback           36 (24)            12 (24)            12.00*
No feedback        26 (24)            22 (24)              .33
Information        36 (24)            12 (24)            12.00*
No information     26 (24)            22 (24)              .33

* p < .001.

There are several limitations in the investigation
which might have attenuated the magnitude
of the effects observed. It is obvious
that the prediction task was quite a demanding
one, for across all conditions the accuracy
scores were generally low. There are three 
reasons the scores might have been so low. 
First, Js may not have been particularly well 
motivated to improve their predictions and, 
thus, may not have made use of information 
or feedback given to them (see Experiment 
II). Second, the 10-sentence ISB protocols 
may not have provided a sufficient basis for 
the prediction of the criterion variables.
Third, the measurement operations involved 
in the criterion variables may have been in- 
adequate, resulting in an unnecessarily dif- 
ficult prediction task. At the present time 
we tend somewhat to discount the latter 
two difficulties. Four reasonably well-trained
graduate students who have performed the
task have done quite well (M = 18 correct
for anxiety) and have not found the problem 
to be a particularly unusual or difficult one. 


EXPERIMENT II 


One deficiency of Experiment I was that 
while it did demonstrate that feedback re- 
sulted in a more accurate performance, it did 
not demonstrate that feedback improved per- 
formance from trial to trial. The possibility 
existed that the feedback groups started off 
better, perhaps by reason of having greater 
interest in the task. In view of the relatively 
small number of judges employed in the ex- 
periment and the anticipated fluctuations of 
trial by trial scores, it was decided to do an 
additional experiment. However, during the 
course of Experiment I a question also arose 
concerning the motivation of all the judges, 
for it was supposed that if the motivation 
were not sufficiently high the judges might 
not care enough about the experiment to want 
to take advantage of the feedback in order 
to improve their accuracy. Therefore, it was 
decided in Experiment II to incorporate an
external incentive in order to determine 
whether motivational level and feedback 
might interact to produce a better performance.

Another aspect of prediction as a clinical 
task was called to the attention of the in- 
vestigators by consideration of the total clini- 
cal situation. For the most part, the predic- 
tions that clinical psychologists try to make 
involve estimates about the absolute level in
a given subject of the trait or characteristic
they are trying to predict. That is, they try 
to say just how high, on some hypothetical 
scale with an implied zero point, the subject 
stands on some such trait as anxiety or a 
characteristic such as favorability of response 
to treatment. Such judgments are compara- 
ble to a psychophysical method, the method 
of single stimuli, used for scaling such things 
as brightness of lights. 

In actuality clinical psychologists may 
often be setting themselves an unnecessarily
difficult task, for if one examines the uses 
to which many of their judgments are put, 
it may be found that comparative or, as they 
are called here, differential judgments are 
more appropriate. For example, take the 
judgment of a clinician about the risk of 
suicide for a given patient and his subsequent 
recommendation that the patient either be 
put on suicidal precautions or not be put on 
suicidal precautions. In some degree all dis- 
turbed persons must be regarded as consti- 
tuting a suicidal risk, but to keep all such 
persons under close surveillance is impossible. 
It is likely that the number of persons on 
suicidal precautions typically approximates a 
constant, and that for any patient, the ques- 
tion of his being on the list depends upon 
the risk he constitutes relative to the total 
patient group. It is altogether likely that the 
above argument holds also for the assignment 
of inpatients to treatment by psychotherapy. 
Whether or not a patient receives psycho- 
therapy depends very little on the absolute 
probability of his response to it; the rele- 
vant issue is whether there are more worthy 
candidates. 

While it is quite difficult to get reliable 
judgments of many characteristics on an 
absolute scale, very subtle distinctions may 
be quite regularly made when patients are 
compared with each other. Many, probably
most, patients can be accurately described
as “very anxious” or as “having conflicts in 
authority relations,” and it is important to 
distinguish among them. What we wish to 
do, in a psychophysical sense, is to pro- 
vide anchor points which will facilitate the 
judgments which need to be made. 

Therefore, Experiment II provided for two 
judging tasks or “types” of predictions. One
is the standard, absolute kind of judgment
in which J has to say pleasant or unpleasant. 
The second is a differential prediction in 
which J has to say, in this instance, which 
of two protocols (representing two individu- 
als) most probably was given by a pleasant 
person. 

The design of Experiment II can be seen 
clearly in Table 3. Half of the Js made dif- 
ferential and half absolute predictions; half 
of each of those groups received a monetary 
incentive, and half did not; and half of the 
Js received feedback, and half did not. Each 
J made 24 predictions in the manner de- 
scribed previously except that the differential 
group looked at 24 pairs of protocols rather 
than 24 single ones. Again the protocols were 
10-item incomplete sentence protocols. In this 
case, following on the interaction in Experi- 
ment I, which suggested that prediction of 
social pleasantness was relatively more im- 
proved by feedback, that was the criterion 
to be judged. The incentive offered was 10 
cents for every prediction correct over 12 
and an additional 10-cent bonus for every 
correct prediction over 18. 
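
The incentive schedule can be written out as a small function. This is a sketch of one straightforward reading of the rule stated above (10 cents per correct prediction beyond 12, plus a further 10 cents per correct prediction beyond 18); the function and parameter names are hypothetical.

```python
def incentive_cents(n_correct, base=12, bonus=18, rate=10):
    """Payment in cents for n_correct predictions out of 24.

    Assumes the bonus is added on top of the basic rate for each
    correct prediction above the bonus threshold.
    """
    basic = rate * max(0, n_correct - base)
    extra = rate * max(0, n_correct - bonus)
    return basic + extra


for score in (12, 13, 18, 19, 24):
    print(score, incentive_cents(score))
```

On this reading a J performing at chance (12 correct) earns nothing, and a perfect score of 24 earns 120 cents at the basic rate plus 60 cents in bonuses.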

The dependent measure was the number
of correct predictions by each J, accumulated
over blocks of eight trials. Thus, there were
three scores for each J, and, if improvement
in accuracy took place, the scores should
increase from Block 1-8 to Block 17-24.

TABLE 3

MEAN NUMBER OF CORRECT PREDICTIONS IN BLOCKS
OF EIGHT TRIALS FOR EXPERIMENT II
(16 subjects/cell)

                                 Block of trials
Condition                      1-8     9-16    17-24
Differential prediction
  Incentive
    Feedback                   5.25    5.38    4.88
    No feedback                4.00    3.94    3.81
  No incentive
    Feedback                   3.50    4.25    4.88
    No feedback                4.38    3.94    4.00
Absolute prediction
  Incentive
    Feedback                   4.25    4.50    3.75
    No feedback                3.94    4.19    4.38
  No incentive
    Feedback                   4.75    4.44    3.94
    No feedback                4.44    3.88    3.81

Judges for Experiment II. Once again Js 
were volunteers from introductory psychol- 
ogy classes, in this case from two large uni- 
versities. The three independent variables at 
two levels each produced eight separate 
conditions. There were 16 Js serving each 
condition, a total of 128 in all. 


Results 


The mean scores for the various experi- 
mental groups in blocks of eight trials are 
given in Table 3. For the independent vari- 
ables, an analysis of variance showed that 
only feedback produces a significant effect 
(F = 9.33, df 1, 120; p< .01). Reference 
to Table 3 shows that those subjects serving 
under feedback conditions performed in a 
manner superior to those who did not receive 
feedback. Neither type of prediction nor the
occurrence of an incentive, however, produced
a significant effect on the accuracy
of the judgments. There was a significant
triple interaction (F = 7.17, df 1, 120;
p = .01) between type of prediction, feedback,
and incentive. Inspection of Figure 1
reveals that the subjects serving in the
feedback-incentive-differential condition per- 
formed very much better than subjects in 
any other condition. 

Taking into account the blocks of trials, 
there is no main effect for trials, that 
is, across all groups there was apparently 
no overall improvement from the first block 
of eight trials to the last block of eight trials. 
There are three triple interactions involving 
blocks of trials, all significant beyond the 
.01 level. Figure 2 shows the interaction be- 
tween feedback, type of prediction, and trials 
(F = 4.94, df 2, 240; p< .01). The most 
outstanding feature of the graph is the fact 
that the results for the feedback differential 
group are in the opposite direction from the 
trend for the feedback absolute group. Those 
subjects making differential predictions who 
received feedback improved their perform- 
ance over the three blocks of trials. Those 
subjects making absolute predictions, in spite 
of the fact they were receiving feedback, 
became worse over the three blocks of trials. 
The general trend for the other groups not 
receiving feedback was also to become worse. 

The interaction between incentive, type of 
prediction, and trials (F = 6.12, df 2, 240; 
p < .01) is presented in Figure 3. Here, once
again, there are rather sharp and opposite 
trends involving two of the groups. The 
group making differential predictions tended 
to improve over the three blocks of trials
when they were not receiving an incentive.
The group making absolute predictions got 
worse over the three blocks of trials when 
they were not receiving an incentive. The 
trend for the other two groups is somewhat 
similar, showing a fairly substantial decline 
from a high level performance on the second 
block of eight trials to the third block of 
eight trials. Both these groups, it will be 
noted, were receiving an incentive. 

Finally, the interaction of feedback, incentive,
and trials (F = 5.03, df 2, 240; p < .01)
is diagramed in Figure 4. This very com- 
plex interaction is quite difficult to interpret, 
the largest arithmetic discrepancy appearing
on the second block of eight trials between
the feedback incentive group and the no 
feedback, no incentive group. The feedback 
incentive group shows an early improvement 
in its performance and then a very substan- 
tial drop on the last block of eight trials, 
whereas the no feedback, no incentive group 
shows a general decline in ability to make 
the kind of prediction required. 

In the above experiment, the importance 
of feedback in facilitating accuracy of clini- 
cal predictions is clearly demonstrated. The 
effect produced by feedback is large, and its 
importance seems to be about equal in combi- 
nation with any one of the other two inde- 
pendent variables. However, the best per- 
formance occurs in a group which has the 
advantage of all three of the independent 
variables which were thought, on an a priori 
basis, to facilitate prediction. When a group 
is not only receiving feedback, but is oper- 
ating with an incentive and is making dif- 
ferential as opposed to absolute predictions, 
its performance is superior to that of any 
other group. 

An admittedly surprising and somewhat 
disappointing outcome of Experiment II was 
the failure to find a significant interaction 
between feedback and blocks of trials which 
would suggest that subjects receiving feed- 
back improved over trials. The fact that feed- 
back was significant only as a main effect 
and in complex interactions suggests the pos- 
sibility that feedback might operate in some 
manner other than as a source of informa- 
tion leading to an improving level of per- 
formance. One possibility is that feedback 
may operate as a cognito-motivational vari- 
able. That is, when a subject is receiving 
feedback in a prediction situation, he is led 
to believe that the task is an important one 
and that improvement in his performance is 
possible. On the other hand, for a subject 
not receiving feedback it may be easy to 
conclude that the task is so simple that it 
requires no particular expenditure of energy 
on his part. Thus, the subject not receiving 
feedback may perform the tasks with the 
investiture of considerably less attention and 
energy than the subject receiving feedback, 
who knows that the task is not an easy one 
but that it is presumably possible to do. It
will be noted that in both of the experiments
reported all subjects were required to spend 
an equal amount of time in the accomplish- 
ment of the task although it cannot be deter- 
mined that they used their time equally well. 

If, however, feedback does not serve an 
informative function but serves only a moti- 
vational or attitudinal purpose in the experi- 
ment, then it should not matter how feedback 
is actually given. In Experiment III to be 
reported below, the relationship between the 
response of the subject and the occurrence 
of feedback was systematically manipulated 
in an attempt to determine whether the effect 
of the feedback is reduced when its informa- 
tional function is minimized or eliminated. 


EXPERIMENT III
Method 


The same materials as were used for Experiments 
I and II were employed in Experiment III. However,
since the results for differential prediction in
Experiment II were not clear-cut, the procedure was
simplified by requiring only absolute predictions in
Experiment III. Experiment III consisted of four
conditions of feedback: 

Condition I. Feedback. The Js were given feedback
concerning the accuracy of their predictions
in the manner previously described. 

Condition II. Random Feedback. The Js were 
given the same amount of “feedback” as in the 
feedback condition, but it was administered ran- 
domly with respect to the accuracy of the subject’s 
actual predictions. Thus, on a random, predeter- 
mined schedule, the subject was told that his pre- 
diction was right or wrong without respect to his 
actual performance. 

Condition III. Reversed Feedback. The feedback 
was given in the same manner as in Condition I, 
but its direction was reversed. That is, Js were told
“pleasant” for protocols that actually had been
elicited from unpleasant Os, and were told “unpleasant”
for protocols that had actually been elicited
from pleasant Os.

Under the Reversed condition, if feedback served only
an informative function, it would be possible for the
subjects to achieve as high accuracy as under the
feedback condition simply by reversing the direction
of their predictions. However, feedback that is
reversed could prove disrupting to the subject’s
performance if it requires him to reverse a fairly
strong initial disposition. If feedback has no
informational value and only a motivational value,
then performance under reversed feedback should be
equally as high as under feedback since implicit in
the notion of the non-informational value of feedback
is the idea that the subject will ignore it for
purposes of decision making.

Condition IV. No Feedback. In Condition IV, subjects
responded under the standard no feedback conditions.

Judges. Subjects in Experiment III were run by 
two different Es, both female. One E ran 11 subjects
in each of the four conditions in the spring quarter
of the academic year. The other E ran 10 subjects
in each of the four conditions in the winter quarter 
of the following academic year. Thus there were 21 
subjects in each of the four conditions, a total of 
84 Js in all. Once again, all Js were volunteers from 
introductory psychology classes.


Results 


In Table 4 are presented the means for 
the four experimental conditions given in 
blocks of eight trials. 

An analysis of variance showed a highly
significant (F = 7.83, df 3, 76; p < .01) main
effect for the experimental conditions. Reference
to Table 4 shows that the feedback
condition produced the highest level of accuracy,
followed fairly closely by the random-feedback
condition. Reversed feedback and
no feedback are inferior to the other two
conditions. There was no significant difference
between results for the two Es, but there
was a significant Conditions × Experimenters
interaction (F = 4.15, df 3, 76; p < .05).
However, the interaction will be ignored
here, since the E variable is completely confounded
with the particular experimental
population from which Js were drawn and
the time of the year at which the experimental
data were collected. It cannot be
determined which differences between the two
Es produced the significant interaction.

Once again, there was no main effect sig-
nificant for the variable of blocks of trials,
thus again suggesting that there is no overall
improvement in performance across trials for
the four experimental groups. Moreover, on
this occasion there were no significant inter-
actions involving the blocks of trials variable.

TABLE 4

MEANS FOR FOUR EXPERIMENTAL CONDITIONS BY
BLOCKS OF EIGHT TRIALS

                              Trials
Condition             1-8     9-16    17-24
Feedback              4.52    4.95    5.67
Random feedback       5.14    4.71    4.43
Reversed feedback     4.48    4.00    4.10
No feedback           4.29    3.76    4.38
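The ordering of conditions described above can be checked by averaging the three block means reported in Table 4 for each condition. The following is only an illustrative sketch and not part of the original analysis: it weights the three blocks equally, it reads the Random feedback entry for Trials 9-16 as 4.71 (an assumption about a garbled digit), and the names `table4`, `overall_mean`, and `ranked_conditions` are hypothetical.

```python
# Block means from Table 4, one list per condition:
# [Trials 1-8, Trials 9-16, Trials 17-24].
# Assumption: the Random feedback value for Trials 9-16 is read as 4.71.
table4 = {
    "Feedback": [4.52, 4.95, 5.67],
    "Random feedback": [5.14, 4.71, 4.43],
    "Reversed feedback": [4.48, 4.00, 4.10],
    "No feedback": [4.29, 3.76, 4.38],
}


def overall_mean(blocks):
    """Unweighted mean of the three block means for one condition."""
    return sum(blocks) / len(blocks)


def ranked_conditions(table):
    """Condition names ordered from highest to lowest overall mean."""
    return sorted(table, key=lambda name: overall_mean(table[name]), reverse=True)


if __name__ == "__main__":
    for name in ranked_conditions(table4):
        blocks = table4[name]
        change = blocks[-1] - blocks[0]  # last block minus first block
        print(f"{name:18s} mean={overall_mean(blocks):.2f} change={change:+.2f}")
```

On these figures only the Feedback group rises steadily from the first block to the last, which fits the report that there was no overall blocks effect across the four groups.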


DISCUSSION 


Perhaps the most general and important 
conclusion to be drawn from the experiments 
reported is that feedback is an important 
variable determining accuracy of clinical pre- 
diction but that the information contained in 
the feedback may be of less importance than 
the mere occurrence of feedback. The failure 
to obtain clear-cut evidence for improvement 
in performance across trials in Experiments 
II and III taken with the similarity in per- 
formance of the feedback and random feed- 
back groups in Experiment III suggests 
strongly that a paramount effect of feedback 
is on the level of motivation or interest of 
the subject in the experimental task. Feed- 
back may facilitate performance by indicating 
to the J that the task is an important one 
about which he should be concerned and by 
suggesting that improvement in his perform- 
ance is possible if he will only pay attention 
to the material with which he is given to 
work.

It must be noted that the prediction task 
we have used is quite a difficult one and, for 
college sophomores at least, the asymptote 
for accuracy is at a fairly low level. Perhaps 
in subsequent investigations if a prediction 
task is used which is easier, in the sense of 
permitting a higher level of accuracy, it will 
be demonstrable that the effects of feedback 
accrue for a reasonably long time in a train- 
ing series. Certainly what has been suggested 
concerning the necessity for feedback in 
clinical prediction situations supposes that 
improvement in clinical prediction will take 
place over a long period of time, perhaps 
even among very expert clinicians. Even 
though previous investigations have often led 
to such a conclusion, it would be disappoint- 
ing to discover that all the improvement in 
predictive accuracy that is possible occurs 
during the first two or three psychology 
courses that students take. 

On the other hand, even if the effect of 
feedback is only motivational, its importance 
in actual clinical situations is not to be 
disparaged. It might very well be that if, 
in clinical situations, there were more sys-
tematic feedback available to clinicians about
the outcomes and accuracy of their predic- 
tions, a greater intellectual interest in clinical 
prediction problems could be elicited along 
with a generally higher level of accuracy. If 
clinical psychologists are not firm in their 
belief that their predictive efforts are im- 
portant, or are to be utilized and evaluated, 
disinterest in and neglect of these skills 
becomes quite understandable. No matter 
what the level of the psychologist, the sys- 
tematic feedback of information about the 
people who utilize and evaluate his predic- 
tions cannot help but be of interest and 
importance to him. 

The slight but apparent superiority in 
Experiment III of the feedback to the ran- 
dom-feedback group as well as the distinct 
inferiority of the reverse-feedback group 
suggests that feedback does have some in- 
formative function. If feedback served only 
to develop the interest of J in his task, then 
reversing the feedback should not be detri- 
mental to his level of performance. However, 
in this experiment, it was shown that re- 
versing feedback involves a definite disrup- 
tion. It seems quite likely that for the kind 
of prediction involved in this particular task 
Js have some initial bias concerning the rela- 
tionship between sentence completions and 
social pleasantness, but, as is indicated by 
the performance of the no feedback group, 
that bias does not permit any substantial 
level of accuracy in prediction. However, the 
requirement in the reverse feedback group 
that the bias be ignored, in fact its contradic- 
tion by the experimental feedback, involves 
a disruption which prevents J from using 
whatever information he obtains from the 
feedback simply to reverse his predictions 
and achieve a higher level of accuracy. 
Within the limits of this experimental situa- 
tion and the number of trials employed, feed- 
back is of value only when it does not con- 
tradict the initial assumptions of J about 
what it is he is trying to predict and how 
he is to go about it. 

It is of interest that neither the incentive 
manipulation nor type of prediction produced 
a significant main effect on the accuracy with 
which the predictions were made. Apparently, 
within the limits of the small incentive used 
in Experiment II, an incentive to do well or
to improve the accuracy of one’s predictions 
is not of greater importance than whatever 
inherent incentives are operating in the ex- 
perimental situation, for example, desire to 
please E, desire to enhance self-esteem, etc. 
Moreover, differential predictions do not au- 
tomatically lead to greater level of accuracy 
than absolute predictions. Each of these vari- 
ables seems to have its effect only in con- 
junction with the other and with the occur- 
rence of feedback. Then their influence seems 
marked. 

The interpretation of triple interactions is 
always risky, but there are a few points about 
the interactions reported in Experiment II 
that seem reasonable. First, the combination 
of feedback and a differential prediction task 
does lead to increased accuracy across trials, 
and the differential prediction task may result 
in improvement across trials when there is no 
incentive. But what is clearer is that absolute 
prediction tasks yielded either no improve- 
ment or decrement in accuracy across trials 
whether there is feedback or incentive or not. 
Further work on the differences between abso- 
lute and differential prediction tasks is clearly 
demanded. 

The interaction portrayed in Figure 4 be- 
tween feedback, incentive, and trials is, as 
has already been noted, quite complex, but
the clearest and most understandable trend 
is for the no feedback, no incentive group 
to decline rather sharply across the three 
blocks of eight trials. Just why the feedback 
incentive group should have shown an initial 
improvement and then a sharp decline is not 
easy to understand. 

The effect of an external incentive in a 
complex performance situation may not al- 
ways be what would seem to be anticipated 
on an a priori basis. One possibility is that 
the imposition of an external incentive estab- 
lishes a set in the J to expect a substantial 
degree of accuracy and improvement in his 
performance. If, as was the case in this very 
difficult prediction task, the achievement of 
a very substantially high level of accuracy 
is difficult, it is even possible that having estab-
lished an initial set on the part of J that he
will be able to perform well and acquire a
worthwhile amount of money may lead to
discouragement and even hostility when the
task proves to be too difficult for him. Thus, 
at least in some of the conditions, the drop 
on the last series of trials in accuracy of 
prediction by groups receiving an incentive 
may have resulted from disappointment and 
even “giving up.” 

Although it was ignored in the latter two 
experiments, the variable of preinstruction 
or job analysis as suggested by Holt (1958) 
is an important one for consideration by 
clinical psychologists and is deserving of more 
extensive experimental study. Results of Ex- 
periment I, albeit with naive clinicians, sug- 
gest that instruction about and consideration 
of the nature of the criterion to be predicted 
facilitates the making of accurate predictions.
Clinical psychologists are generally not as 
well informed as they should be about the 
indicators and signs of many of the variables 
or traits about which they attempt to make 
predictions. For example, one suspects that 
very few clinical psychologists are particu- 
larly well informed about the functions of the 
central nervous system and the effects that 
disruptions can be expected to produce on 
tests. 

Another equally good example, although 
for somewhat different reasons, is the pre- 
diction concerning the response of a patient 
to psychotherapy. The prediction, in the 
abstract, that a patient will or will not 
respond to psychotherapy is very nearly a 
meaningless one and cannot be expected to 
be made with any high level of accuracy. 
Before such a prediction can be made it 
would be necessary for the predicting cli- 
nician to have a complete understanding of 
the kinds of psychotherapy to be applied to 
the patient, the length of the psychotherapy 
he is to receive, and perhaps even a great 
deal about the personality of the individual 
who is to give the therapy. 

Obviously there may be objections that the 
present studies of clinical predictions have 
grossly oversimplified the predictive task, the 
prediction situation, and have used judges 
who are scarcely comparable to clinical psy- 
chologists. But the vast samples of expert 
clinicians which should be available for re- 
search simply are not, and in any case it 
would be wasteful of their time and effort. 


An attempt to check major findings on 
smaller samples of clinical psychologists and 
graduate students is being made. In one 
pilot investigation involving eight graduate 
students in clinical psychology, there was a 
clear-cut effect which could be attributable 
to feedback. Unfortunately, because of an 
experimental confounding with order of ma- 
terials used, this effect will have to be 
checked with an additional sample. Nonethe- 
less, even with very difficult kinds of materi- 
als and more sophisticated clinical Js, impor- 
tant effects of the feedback variable are 
evident. Thus it is hoped that findings pre- 
sented here will have a fairly substantial 
generality to different kinds of prediction 
tasks and to different samples of Js. 
A recent article by Goldberg and Werts (1966) raises questions about the
reliability of experienced clinicians’ subjective judgments. A 2nd article by 
Sechrest, Gallimore, and Hersch has studied improvement of subjective clinical 
prediction in naive judges as a function of feedback, motivation, and knowl- 
edge of how criterion judgments were made. While all 3 variables obviously 
should improve clinicians’ judgments, they clearly do not always do so. This 
article discusses the need to make more appropriate the kind of motivation, 
feedback, and knowledge of criterion in order to better insure the process
of learning from experience.


Two recent articles in this journal (Goldberg 
& Werts, 1966; Sechrest, Gallimore, & Hersch, 
1967) have raised questions about the validity 
of subjective judgments made by clinicians 
working from test data. The purpose of these
comments is to consider both these articles 
briefly in order to pinpoint their implications 
for current clinical practice and for the training 
of clinical psychologists. 

In essence, the model for the Sechrest, Galli- 
more, and Hersch studies was one in which 
undergraduate students were used as “clinicians” 
and the effect on the accuracy of the clinical 
judgments was studied in relationship to the 
variables of (a) feedback, (b) knowledge of 
how criterion judgments were arrived at, and 
(c) motivation or incentive. Some but not con- 
sistent evidence of the efficacy of feedback was 
found, but to the authors the improvement 
with feedback seemed as attributable to the 
presumed motivation of subjects in the feedback 
condition as it was to the actual feedback. They 
also report some preliminary data which suggest 
stronger evidence of the value of feedback for 
graduate students in clinical psychology (for 
whom high motivation could be assumed), end- 
ing their article at least on a hopeful note. 

On the other hand, the article by Goldberg 
and Werts seems more discouraging. Using a 
multimethod, multitrait approach they selected 
four tests and four experienced clinicians to 
make judgments on four traits. The tests were 
the MMPI, the Wechsler-Bellevue, the Ror- 
schach, and a Vocational History. The traits 
were social adjustment, ego strength, intelligence, 
and dependency. The four clinicians all had their 
PhDs from approved clinical training programs 
plus “considerable postgraduate clinical experi- 
ence.” Goldberg and Werts (1966) summarize 
their findings as follows: 
The findings indicate quite clearly that the judg-
ments of one clinician working from one data
source bear no systematic relationship to those of 
another clinician working from another data source, 
even though both judges are ranking the same 
patients on the same trait. On the other hand 
judgments of diverse traits from the same data 
source do tend to be related [p. 199]. 


The implication from this study is that clinicians 
operating with different sources of data are 
likely to arrive at different conclusions regarding 
the subject. Those working with the same test 
are likely to have biases or halo effects which 
stretch across traits. The authors recognize that
the tests were not designed to provide judgments 
on all of these four variables, but state that it 
is a frequent practice to use them for such vari- 
ables and that their findings argue against the 
wisdom of such practice. Broad generalizing from 
either of these studies would be dangerous, but 
at a minimum they are suggestive of severe 
limitations in the overall practice of making 
subjective judgments from test responses and 
raise again the issue of actuarial versus clinical 
prediction (Meehl, 1954).

The Sechrest, Gallimore, and Hersch study has 
a number of qualifications limiting the generali- 
zation of its results, to which the authors would 
readily agree. These are: (a) the subjects were 
undergraduates with little experience and pos- 
sibly low motivation; (b) test data (10 ISB 
responses) may have been too small for a valid 
judgment; (c) while knowledge of how the cri-
terion was determined was given to the subjects,
this knowledge could only have provided a
minimum of information to the subjects; and
(d) feedback was given with an overall judgment 
for correct or incorrect without knowing whether 
or not the subjects had made their hypotheses 
explicit or whether or not the hypotheses the 
subjects were relying upon were relevant or ir-
relevant. If a subject was basing his judgment
on an irrelevant hypothesis he still had a 50-50 
chance of being told that he was correct, and 
similarly for his incorrect guesses. Similarly, if 
the subject was not consciously aware of the 
basis for his judgment, feedback could provide, 
at best, a most inefficient basis for learning. 
This point will be further discussed later in 
this paper, since it has direct significance for the 
proper use of feedback if learning is to take 
place efficiently. 
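The 50-50 argument can be made concrete with a binomial calculation. The sketch below is an illustration under stated assumptions, not a reanalysis of the experiments: the criterion is treated as binary, the judge is assumed to rely on a cue unrelated to it (so each prediction is confirmed with probability .5, independently across trials), and a series of 24 trials is used to match the three blocks of eight. The function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of being told "correct" on at least k of n trials.

    Assumes independent trials, each confirmed with probability p;
    p = 0.5 corresponds to a judge relying on an irrelevant cue.
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    n_trials = 24  # three blocks of eight trials
    for k in (12, 15, 18):
        print(f"P(at least {k} of {n_trials} confirmed) = {prob_at_least(k, n_trials):.3f}")
```

Under these assumptions a judge using an entirely irrelevant hypothesis is confirmed on half or more of the trials more often than not, so correct-or-incorrect feedback alone gives him little reason to abandon it.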

The findings of Goldberg and Werts (1966) 
are also limited in their potential generalization. 
The authors themselves are aware of the fact 
that the tests were not constructed or devised 
to provide all of the kinds of judgments required 
in the study. The validity of the instruments 
used for the purpose of making some of the 
trait judgments is clearly in question. Although 
Goldberg and Werts say they recognize this 
point they feel that such illegitimate use of these 
tests is frequent. The question still is open as 
to what the reliability of clinicians’ judgments 
would be if they were using tests only for pur- 
poses for which they had been devised and 
validated, and if they got together to clarify 
definitions and agree upon clear-cut behavioral 
referents for what they were rating. Semantic 
difficulties in the concepts (traits) employed by 
Goldberg and Werts could be as much the source 
of unreliability as clinicians or tests. 

Both common sense and experimental data 
support the notion that if one has a complex 
criterion, affected by many variables, standard 
testing conditions, and a stable population from 
which validation samples can be drawn, then 
actuarial prediction is likely to be superior to 
clinical prediction. The statement would hold 
whether a single test was used, or a battery of 
tests and other information. There does not seem 
to be sufficient evidence to support the comfort- 
ing notion, held by many 10 to 20 years ago, 
that the experienced clinician was a far superior 
computer to an electronic one. But the com- 
puter and the actuarial method cannot work 
satisfactorily if testing conditions are not stand- 
ard, if the subjects being tested are drawn 
from a shifting population, and if the circum- 
stances of testing vary from administration to 
administration. Included in these circumstances 
would be the condition of the patient being 
tested, the effects of the individual examiner, 
the purposes for which the test was being given, 
etc. On the other hand, the clinician is limited 
in the number of variables which he can juggle 
at one time. Frequently he has made little use
of his experience in being able to deal with
differences in the significance or meaning of 
responses as a function of the culture the pa- 
tient comes from, the condition of the patient, 
or the influence of other situational factors in- 
cluding his own influence on test responses. Nor 
is it clear that experience literally teaches. At 
least some clinicians seem to be more concerned 
with practicing than with learning. It is rare 
that clinicians make valiant attempts to obtain 
systematic feedback so that they can change 
or improve their procedures. More often they 
are concerned in demonstrating to others and 
perhaps to themselves that their clinical judg- 
ments and the recipes on which they are based 
are valid. 

In a previous publication, the author has sug- 
gested (Rotter, 1963) that one of the main func- 
tions of the clinician in the interpretation of 
tests is the application of informal norms 
which could be used in addition to test scores, 
formulas, interpretation rules, etc. These in- 
formal norms would take into account variables 
that were not already represented by systemati- 
cally acquired data. Such informal norms would 
involve differences in interpretation as a func- 
tion of sex, age, cultural background, educa- 
tional level, socioeconomic level, etc. of the pa- 
tient as well as informal norms regarding the 
effects of testing under varying conditions of 
subject motivation, purpose, place, subject condi-
tion, etc. Ultimately, as more and more of these
informal and subjective norms could be made 
objective and formal the clinician would con- 
tinue to make refinements on other variables 
which affect the meaning of test responses.

For many clinical purposes it is not possible 
to collect sufficient data to set up objective 
formulas for prediction. Sometimes the clinician 
is called upon to make broad descriptions of 
major personality characteristics with no specific 
variables being asked for. Sometimes he is 
asked to answer a specific question for a par- 
ticular psychotherapist, or he has to make a 
guess about a particular patient’s potential for 
making adjustments outside a hospital or on a 
particular job. Where it is possible to supplant 
subjective judgment with superior objective or 
actuarial methods, there seems to be little reason 
not to do so. Where it is not possible, then the 
problem we are faced with is how to most effi- 
ciently train the potential clinician to learn more 
from experience. What he has to learn is not how 
to substitute subjective judgment for objective 
scoring of personality tests, but to use those 
validated methods of scoring and interpreta- 
tion that are available and to improve upon 
them by taking into account more data rather
than less. Some of the data he has to take into 
account might be aspects of the test responses 
which are not taken into account by a par- 
ticular method of scoring. However, it is also 
of great importance that he make use of those 
facts about the patient or subject and about 
his testing conditions that are not included in 
the test responses at all. Too often in the past 
the teaching of subjective clinical evaluation 
has concentrated only on internal aspects of 
the tests and neglected what may be the more 
important sources of variance. 

The three variables for improving clinicians’ 
performance studied by Sechrest, Gallimore, and 
Hersch, namely, feedback, motivation, and knowl- 
edge of criterion, have been demonstrated to 
be important variables in learning and per- 
formance in a variety of learning situations, and 
there is no reason why they should not also 
be important variables for learning to improve 
predictions from tests. However, to apply them 
to this learning task we must specify what is 
(a) the relevant motivation, (b) the relevant 
knowledge regarding the criterion, and (c) the 
proper use of feedback to make the process of 
learning most efficient. 

The motivation to prove that one is right 
is hardly the appropriate one for discovering 
what one is doing wrong. The motivation to 
obtain approval from teachers or supervisors on 
internship has its practical justifications but may 
be irrelevant to improving performance in pre- 
dicting significant behaviors of the patient. Ir- 
relevant motivation will direct the student’s 
attention to irrelevant cues. It seems reasonable 
in the light of the lack of established validity 
for the particular uses to which most of our 
clinical instruments are employed, that we need 
to start training graduate students in clinical 
psychology by impressing them not with how 
knowledgeable the experts are, but rather how 
much everyone has to learn in order to achieve 
reasonable prediction. It is an interesting aside 
that many times when a graduate student arrives 
at an internship with a reasonable skepticism 
regarding the validity of the tests in use in 
that clinic or hospital, this is perceived as 
a lack of preparation on his part as a result 
of inadequate training from the university. Often 
the skeptical attitude is negatively reinforced. 
The field of clinical psychology has grown and 
professionalized very rapidly. In many instances 
the clinical psychologist is on the defensive be- 
cause he has been accepted as being able to do 
more than he actually can do. Our knowledge 
has not grown as rapidly as our acceptance
both by other professionals and the lay public.
Placed, therefore, in the position where others 
are expecting us to do more than we are 
capable of, many clinicians have reacted to their 
discomfort by trying to prove to themselves that 
they are far more capable than in fact is the 
case. The proper motivation for learning from
experience is not the approval of teachers and
supervisors, or grades, or in the case of the PhD,
promotions and acceptance from colleagues, but 
rather a motivation to continuously acquire data 
which can be used to improve the basis on 
which clinical judgments are made. Such a 
motivation starts with the recognition of the 
severe limitations of our present state of knowl- 
edge and of predictions from objective scoring 
and rules and recipes for the interpretation of 
test data. 

What kind of knowledge regarding a criterion 
is likely to be most useful for learning from 
experience? Obviously if a clinician has to im- 
prove his technique as a result of feedback, 
he is better off knowing that the criterion is 
determined by subjective estimate of an un- 
known psychiatrist made under unknown condi- 
tions than knowing nothing at all. However, 
he is not much better off. Similarly, if he 
were predicting suicidal risk, information that 
the patient had or had not committed suicide 6 
months later would be of more value than no 
information at all. However, such criterion in- 
formation would not be of great value if one 
did not know whether or not the patient was 
under observation or confinement during that 
6-month period, or whether serious suicidal at- 
tempts had been made. If there is to be real 
learning from experience, then considerable effort 
must be expended in obtaining adequate criterion 
data for feedback purposes. If it is worthwhile 
to do diagnostic testing, then it is worthwhile to 
do follow-ups. If necessary, students and cli- 
nicians will have to see fewer cases and follow 
them up more carefully. It is also important 
that while other kinds of predictions and obser- 
vations may be made from tests, particularly 
for purposes of training, emphasis should be 
placed upon making the kinds of predictions for 
which good criterion data will ultimately be 
available. 

While it is not part of our present “get-rich- 
quick” orientation towards training and practice 
to do long-term follow-ups, it may well be,
if the profession intends to improve its predic-
tive techniques, that we will need to think in
terms of diagnostic practicums extending over 
years rather than months. The adequacy of 
follow-up studies should be an important part 
of the criteria for the recognition and approval
of adequate practicum and internship facilities. 
In many instances, however, shorter term predic- 
tions regarding reactions to psychotherapy, ef- 
fects of various kinds of other interventions and 
of behavior in situations other than the testing 
situation (ward behavior, work adjustment, co- 
operativeness with the staff, etc.) may also be 
used for direct observation criteria. 

In general, both in training and practice we 
need to use our tests more often to set up 
predictions about events where some reasonable 
criteria can be obtained. Generalized descrip- 
tions of personality which cannot easily be 
related to specific criteria may be of some value 
after the validity of the instruments has been 
well established. At the present time, however, 
there is a greater need for improving our tech- 
niques than there is for simply making general- 
ized statements of low or doubtful validity. 

Finally, we come to the issue of the kind of 
use that should be made of feedback or knowl- 
edge of criterion. If the clinician bases his 
interpretations on holistic impressions and “gut” 
reactions and is told only that he is correct or 
incorrect, he is not likely to obtain clear-cut 
indications of the nature of the necessary changes 
in his interpretive methodology. If feedback is to 
be useful, then the first requirement is that the 
basis for predictions be made as explicit as 
possible. Even where the basis for the predictions 
is explicit and the nature of the feedback is 
behavioral or objective data, it may still take 
a great many instances of studying concomitant 
variations before the errors or the correct bases 
for prediction can be identified. In fact, where 
so many variables are involved (test, personal, 
cultural, and situational), the problem of learn- 
ing from individual experience may be close to 
hopeless. The clinician must be prepared to reg- 
ularly make use of feedback by deciding on 
what are probably the most appropriate changes 
to be made in his basis for prediction and make 
informal tests of these by the use of previously 
collected tests from clinic files or by collecting 
additional data specifically to test his hypothe- 
ses. Normally he would start with some in- 
formal data collection, and finally he would 
collect formal data using the proper controls and 
statistical analysis. 

Again there are clear-cut implications for the
training of clinical psychologists. Not only is
it necessary for the clinician, who will ultimately 
develop skill in diagnostic testing, to have train- 
ing in an experimental attitude, but also in experi- 
mental skills. If the clinicians actually involved 
in clinical practice do not continue to do re- 
search, including formal research on the validity 
of their own practices, then it is unlikely that 
much progress can be obtained in the field of 
practice itself for some time. Our only alterna- 
tive to increasing clinical skill by the accumula- 
tion of experience is the establishment of gen- 
eral laws and their complex interactions so that 
applications of general psychological knowledge 
can be made easily to complex practical situa- 
tions. While such efforts may in the long run 
contribute greatly to individual practice it will 
be many years before applications of our gen- 
eral psychological knowledge to complex situa- 
tions can be made with a reasonable degree of 
confidence.

What has been said here about learning from 
experience in diagnostic testing could also apply 
to learning from experience in psychotherapy 
practice and relates to the current controversy 
on the importance of experimental training for 
clinical psychologists. It seems to this writer 
that training not only in an experimental atti- 
tude and in how to interpret research, but also 
considerable training and experience in how to 
do research will remain a central part of the 
clinical psychologist’s training, at least until such 
time as the validity of clinical practice can be 
demonstrated to be considerably higher than 
it now is. 
RESEARCH ETIQUETTE IN THE STUDY OF CLINICIAN’S BEHAVIOR


KENNETH B. LITTLE 


University of Denver 


A critique of research strategy and tactics in the investigation of the processes 
and functions of clinical psychologists is presented using the report of Sechrest, 
Gallimore, and Hersch as an example. It is argued that the continuation of 
studies using college students as an analogue to clinicians contributes little 
to the understanding of such processes or functions and that the methods
of differential psychology are more appropriate than those of experimental
psychology.


One of the most important problems of con-
temporary clinical psychology is the devising of
methods for improving the performance of psy- 
chologists in their diagnostic roles. Twenty years 
of research have produced study after study 
demonstrating the low validity (as well as relia- 
bility) of predictions based upon psychodiag- 
nostic techniques. The results hold not only for 
“naive” clinicians, that is, college sophomores, 
but also for skilled psychologists working with 
familiar predictive tasks (e.g., Little & Shneid- 
man, 1959). Even in the most encouraging re- 
ports (e.g., Holt, 1958), the validity coefficients 
are not such as to reassure anyone having to 
make predictions about a single case. 

It is conceivable, of course, that any further 
increases in accuracy are impossible; that the 
process of prediction is so complex, so subject 
to confounding by unknown situational variables, 
and existing psychodiagnostic instruments so 
crude, that an asymptote of accuracy has been 
reached. A conclusion of this sort seems to have
been reached by Meehl (1960) who then figura- 
tively kicks the assessment process while it is 
down by citing survey evidence that the major- 
ity of persons concerned with the treatment of 
psychic ills consider a priori personality evalua- 
tions and predictions of little utility anyway. 

If the conclusion is correct, then clinical psy- 
chology is in an embarrassing position. Through- 
out the United States, thousands of man-hours 
are devoted each day to psychological evalua- 
tions, evaluations which the makers thereof pre- 
sumably think of as bases for valid predictive 
statements. A few thousand more man-hours go 
into the teaching of students how to administer, 
score, and interpret the same instruments used 
by the practitioner. In view of the mass of nega- 
tive research results, an objective observer might 
be justified in arriving at a diagnosis of mass 
hysterical blindness among clinical psychologists. 

An alternative conclusion somewhat more 
palatable to most of us is that the issue is still 
open; that the research techniques used to date 
simply do not get at the validating evidence 
buried in the clinicians’ experience. This position 
does not assert perfection but only a substan- 
tially higher accuracy rate and utility value for 
clinical predictions than has been presented in 
the evidence to date. One may view this con- 
clusion as a rational belief, an article of blind 
faith, or as a working hypothesis subject to test 
by appropriate research procedures. 

The majority of the proponents of this posi- 
tion (among whom the present writer is an 
uneasy member) would probably concur with 
Holt’s (1958) recommendation that we hold 
in abeyance the ceaseless flow of validation 
studies and get on with the business of improving 
both our instruments and the predictive capacities 
of those using them. It is this second part 
of this recommendation that has engaged the 
attention of Sechrest, Gallimore, and Hersch 
(1967). 

As would most American psychologists when 
desirous of improving learning, the authors have 
selected operant conditioning for initial consider- 
ation. (After all, generations of rats and pigeons 
and a fair number of humans have devoted 
their careers to proving that reinforcement— 
pellet or knowledge of results—will significantly 
increase the frequency of occurrence of a selected 
response.) They set as their goal the determina- 
tion of the effects of knowledge of results, that 
is, feedback, and of information about the traits 
of concern, on the accuracy of clinical psycholo- 
gists making predictions of personality character- 
istics of others. Their paradigm is college 
freshmen and sophomores guessing whether 12 
individuals are “anxious” or “nonanxious,” 
“pleasant” or “not pleasant” using as cues 10 
ISB items; as “information,” a definition of the 
trait plus sample responses of high and low 
individuals or a description of the characteristic 
and how it was measured; and as feedback, the 
statement to the subject that he was right or 
wrong after each prediction. 

This reviewer read the article with both inter- 
est and distress, the interest stemming from the 
importance of the topic and the ingeniousness 
of the procedure employed; the distress from 
what seems to be a blind repetition of many 
of the mistakes of the earlier validation studies. 

It is commendable, but does not change the 
situation, that the authors recognize that the 
deficits exist. The innocent reader hastily scan- 
ning an article or perhaps reading only the 
abstract is more influenced by analysis-of- 
variance-tables and summarized conclusions than 
he is by modest indications of the limitations 
of the study. Let me elaborate on some of these 
deficits. 

1. Volunteers from elementary psychology 
courses are presented as representative of either 
practicing clinicians or budding clinical psychology 
students. As a rough guess, no more than 
1% of that undergraduate population would pass 
the screening processes used by graduate pro- 
grams in clinical psychology. On the basis of 
intellectual ability and of motivation alone (both 
significantly related to learning), they should 
be ruled out. The authors do note the problem 
of motivation and attempt to remedy it—unsuc- 
cessfully—with financial inducements. 

2. The amount of “information” provided 
about the characteristics to be predicted is 
woefully limited (cf. the “job analyses” pro- 
vided to the judges in the Menninger studies or 
in the Holzman and Sells, 1954, study). Paren- 
thetically, it is not clear whether the “informa- 
tion” was also provided in Experiments II and 
III of the present work. Since this is an integral 
part of Holt’s recommendation for an adequate 
design in clinical prediction research, it would 
be highly desirable to have included it (if it was 
not) even though it did not appear as a signifi- 
cant factor in Experiment I. 

3. The number of trials over which the ef- 
fect of feedback was expected to reveal itself 
is remarkably small. Even in so simple a cogni- 
tive task as probability learning, the typical 
number of trials is 150 to 400. In a complex 
task such as prediction of a personality charac- 
teristic using as cues samples of responses to 
the ISB (having an indeterminate but probably 
modest “true” cue value for the trait in ques- 
tion), certainly a greater number of trials than 
24 would seem mandatory. It is quite possible 
that the authors were, in Experiments II and 
III, looking for trends in what was still essen- 
tially a random process, the subjects having not 
yet explored a sufficient number of guessing 
strategies to settle on the most profitable one. 
(An examination of the reliabilities of the 
guesses as represented in the judgments based 
on the first set of ISB responses of the target 
individuals versus the second set might be en- 
lightening in this regard.) A decline in perform- 
ance after an initial improvement (a phenome- 
non which concerns the authors in their feed- 
back-incentive group) is quite common in studies 
of sequential decisions. Subjects apparently try 
one approach, become dissatisfied with the returns 
it produces, and sacrifice the assured [...] 
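
The concern about the small number of trials can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that each of the 24 judgments is an independent two-choice guess with a chance success rate of .5 (the actual base rates in the experiments may have differed), and asks how many correct guesses a subject would need before a one-tailed binomial test at the 5% level would distinguish him from chance. The function names are hypothetical.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_hits_for_significance(n, alpha=0.05, p=0.5):
    """Smallest number of correct guesses whose one-sided tail probability is <= alpha."""
    for k in range(n + 1):
        if upper_tail(n, k, p) <= alpha:
            return k
    return None


# Assumed setup: 24 independent two-choice trials, chance rate .5.
k = min_hits_for_significance(24)
print(k, round(upper_tail(24, k), 4))  # 17 hits, tail probability about 0.032
```

Under these assumptions a subject must be right on 17 of 24 trials (about 71%) to exceed chance, which leaves little room for a modest feedback effect to show itself.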

In short, this reviewer experienced some mild 
surprise that any of the effects tested, including 
the interactions, reached an acceptable level of 
statistical significance. 

Many of the same objections could be raised, 
for example, to experiments reported in the 
“hard-science” journals of APA. But the history 
of inconclusive research on the processes and 
functions of clinical psychologists is such and 
the problems of sufficient importance that fairly 
rigorous demands on experimental procedures 
might reasonably be made. For two decades, 
much of our clinical research has resembled the 
behavior of the inebriate searching for his watch 
under the street lamp, simply because that was 
where the light was, not where he lost it. Such 
behavior is not necessarily profitless as Kaplan 
(1964) points out, but it seldom provides a 
conclusive answer to the problem being investigated. 
It is true that investigations in the clinical 
area are expensive, expensive in time, in 
money, in rarity of subjects. But surely we are 
now at the point where if we wish to draw 
conclusions about the behavior of clinical psy- 
chologists we should examine them and not white 
Norway rats or college students, and examine 
them in their native habitats contentedly 
munching away at enormous masses of data. 

There is a second problem reflected in the 
experiments of Sechrest, Gallimore, and Hersch 
which might also be raised. This is one of re- 
search strategy as contrasted with the tactics 
discussed above. The issue is whether a straight 
experimental approach or some combination of 
experimental and differential procedure is most 
appropriate. The authors have selected the first 
whereas the second seems (to me) preferable. 

To expand on this point, the concept of 
“types” of people has gone out of fashion and 
the student in elementary psychology is con- 
scientiously taught that it simply denotes a 
certain interval on an otherwise continuous distribution 
of ability, usually represented by the 
Gaussian curve. Yet in matter of fact, people 
end up functioning as types. It matters not 
whether ability in accounting, say, is normally 
distributed throughout the population; only 
those who exceed a certain cutoff point of abil- 
ity become accountants. Similarly, only those 
(hopefully) beyond a certain level of intel- 
lectual ability receive PhDs in psychology and 
of those, only a smaller group with certain 
personality or other intangible assets or liabili- 
ties become clinical psychologists. If the ability 
to make accurate personality evaluations, predic- 
tions of personality characteristics from person- 
ality tests, is one of the assets, then the issue 
for clinical psychology is not whether such an 
ability is democratically distributed throughout 
mankind or whether all individuals will profit 
from training in making predictions but rather, 
are there some persons with an appreciable 
degree of the ability and can these individuals, 
or some of them, be made even more skillful. 

In its simplest form, the argument is that 
when one asks if clinical predictions can be 
made with above-chance accuracy or if those 
who make clinical predictions will improve under 
certain training regimes, it resembles a personnel 
selection problem far more closely than it does 
a classical experiment in learning. Clinical psychology 
is an applied discipline; as such, and 
in psychology’s present state of ignorance about 
the determinants of man’s behavior, the empirical 
approach would seem most profitable. 


SITUATIONAL AND PERSONALITY CORRELATES 
OF PREDICTIVE ACCURACY¹ 


JERRY R. TOMLINSON 


Drake University 


An experiment was designed to investigate the relationships between predictive 
accuracy and type of interviewer-interviewee contact. 120 undergraduates 
were administered the Adjective Check List and assigned the roles of inter- 
viewer, interviewee (object), and observer. After a short interview, seen by 
the observer through a 1-way mirror, the interviewer and observer predicted 
the object’s responses to the Adjective Check List. Females were predicted 
more accurately than males regardless of condition of contact or sex of judge. 
There was no difference in accuracy of prediction between direct and indirect 
contact with the object. High-accuracy judges obtained higher scores on the 
Order scale of the Adjective Check List and lower scores on the Change and 
Affiliation scales than low-accuracy judges. Under conditions of direct contact, 
higher accuracy was associated with a higher need for achievement, whereas 
under the condition of indirect contact this personality attribute was associated 
with lower accuracy. 


Hammond (1955) has suggested that the 
problem which arises when a clinician acts 
as both an observer and a measuring instru- 
ment of behavior is analogous to a problem 
of measurement in physics, namely, the inter- 
action between the measuring instrument and 
that which is being measured. In psychology, 
the problem created by the interaction be- 
tween the clinician-observer and the patient- 
object is the possibility that the interaction 
may affect the clinician’s observations. Ham- 
mond (1955) suggests that, in order to 
submit this to study, the focus of investiga- 
tion should be shifted “to a point beyond the 
clinician-observer where an observation can 
take place in a noninteractive fashion [p. 
256].” 

One aspect of this problem, the effect of 
direct versus indirect contact between judge 
and object on the accuracy of subsequent pre- 
dictions of the patient-object’s behavior, was 
the subject of a study by Borke and Fiske 
(1957). They failed to find a difference in 
predictive accuracy under the various experimental 
conditions of direct interaction, observation 
through a one-way vision screen, 
listening to a recording of the patient, and 
reading a transcript of the interview. 

¹ This article is based on a thesis submitted to the 
Department of Psychology, State University of Iowa, 
in partial fulfillment of the requirements for a master’s 
degree. The author wishes to express his appreciation to 
Alfred B. Heilbrun, Jr., under whose supervision the 
investigation was conducted, for his guidance and suggestions 
in the planning and execution of this study. 

Although the variable of sex of judge and 
object has been dealt with in several studies 
(Dymond, 1950; Hathaway, 1956) concerned 
with predictive accuracy, no attempt has been 
made to relate this variable to the type of 
interaction. Neither has an attempt been 
made to determine what personality variables 
might be related to predictive accuracy under 
different contact conditions. 

It was the purpose of this study to investi- 
gate the relationships between predictive 
accuracy and 


1. Direct versus indirect contact between 
judge and object, 

2. Sex of judge and object under the two 
contact conditions, and 

3. Personality of the judge under the two 
contact conditions. 


METHOD 


Predictive task. The Adjective Check List (ACL; 
Gough & Heilbrun, 1965) consists of 300 adjec- 
tives, each item of which S (the subject) either 
endorses as self-descriptive (by checking the adjec- 
tive) or denies as self-descriptive (by not checking 
the adjective). The set of 300 responses thus ob- 
tained was the behavior sample toward which 
predictions were made. 

If predictive accuracy represents the number of 
items that are in agreement (either checked or not 
checked) between the judge’s predictive ACL and 
the object’s self-descriptive ACL, a further problem 
arose. Heilbrun (1961) found that under standard 
self-descriptive conditions the mean number of ad- 
jectives checked for college males was 95 and the 
mean number of adjectives checked for college 
females was 93. Since the average college student 
leaves over 200 adjectives unchecked, it would have 
been possible to obtain high accuracy scores simply 
by leaving all of the adjectives on the predictive 
ACL blank. Further, it seemed reasonable to assume 
that whenever the judge was uncertain about the 
object’s response, he would respond by leaving that 
adjective blank. As a result, a judge could enhance 
his accuracy score in this manner and it would be 
difficult to discriminate between the accurate and 
the cautious (and probably less able) judge. For this 
reason a third, or “don’t know,” category was 
included in the instructions for the predictive ACL’s. 
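
The base-rate problem described above can be made concrete with a small sketch. It assumes, for illustration only, an object who checks 95 of the 300 adjectives (the mean Heilbrun reported for college males); the function name and the simulated checklist are hypothetical and are not part of the study's procedure.

```python
TOTAL_ITEMS = 300  # adjectives on the ACL


def agreement_count(predicted_checked, actual_checked, total=TOTAL_ITEMS):
    """Number of items on which prediction and self-description agree.

    An item agrees when both sheets check it or both leave it blank.
    """
    both_checked = len(predicted_checked & actual_checked)
    both_blank = total - len(predicted_checked | actual_checked)
    return both_checked + both_blank


# Hypothetical object who checked items 0..94 (95 adjectives).
actual = set(range(95))

# A "cautious" judge who leaves every adjective blank.
blank_sheet = set()

hits = agreement_count(blank_sheet, actual)
print(hits, round(100 * hits / TOTAL_ITEMS, 1))  # 205 agreements, 68.3 percent
```

A judge who learned nothing about the object would thus agree on roughly two thirds of the items simply by abstaining, which is the difficulty the “don’t know” category was meant to address.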

Procedure. A total of 120 undergraduate students 
(60 males and 60 females) from a large undergraduate 
class in psychology were used in this investigation. 
The Ss were group-administered the 
ACL under standard instructions. They were then 
instructed to sign up in subgroups consisting of three 
persons each (triads). These triads were arranged 
so that each consisted of two judges (one interviewer 
and one observer) and one object. They were 
further arranged so that, keeping the interviewer 
and observer sex the same, all possible judge-object 
sex combinations were obtained. The Ss were in- 
structed that they must be in a triad in which the 
other two members were unknown to them with the 
exception of casual contact in this particular class. 

Each triad was recalled separately at a time 
ranging from 1 to 4 weeks following the group 
administration of the ACL. The interviewer was then 
instructed to find out as much as he could about 
the object in 15 minutes using the method of direct 
interviewing. The observer was instructed to attend 
only to the object during the interview. Both the 
observer and the interviewer were told that at the 
end of the interview they would be given a task 
that would indicate how well they had gotten to 
know the object. The object was instructed to 
answer the interviewer’s questions as honestly as 
possible. 

The interviewer and the object were then placed 
in the interview room in full view of a one-way 
vision screen for a period of 15 minutes, during 
which time the interview took place. The observer 
was seated on the opposite side of the one-way 
vision screen where he could both hear and see the 
object and the interviewer. At the end of 15 minutes, 
the interviewer and observer were separated from 
the object and instructed to fill out the ACL as 
they thought the object had done in describing 
himself during the first part of the experiment. 

The predictive accuracy scores were obtained by 
matching each ACL item predicted by the judge 
with the corresponding self-evaluative response to 
that item made by the object in that triad. The 
final score represented the percentage of all attempted 
items which agreed with the object’s 
corresponding self-evaluative responses. 
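
The scoring rule just described might be sketched as follows. The data layout (dictionaries keyed by adjective, with `None` standing for a “don’t know” response) and the function name are assumptions made for the example, not details reported in the study.

```python
def accuracy_percent(predictions, self_description):
    """Percentage of attempted items that match the object's own responses.

    predictions: dict mapping adjective -> True (checked), False (not checked),
                 or None ("don't know", i.e. not attempted).
    self_description: dict mapping adjective -> True or False.
    """
    attempted = [adj for adj, p in predictions.items() if p is not None]
    if not attempted:
        return None  # no basis for a score when nothing was attempted
    agreements = sum(predictions[adj] == self_description[adj] for adj in attempted)
    return 100.0 * agreements / len(attempted)


# Toy example with four invented adjectives.
judge = {"calm": True, "bold": False, "shy": None, "tidy": True}
target = {"calm": True, "bold": True, "shy": False, "tidy": True}
print(accuracy_percent(judge, target))  # 2 of the 3 attempted items agree
```

Because the denominator is the number of attempted items, a judge who uses the “don’t know” category freely is neither rewarded nor penalized for the items he skips.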


Differences between classes of judges in the number 
of adjectives for which no predictions were made 
(i.e., placed in the “don’t know” category) were 
analyzed in a Sex of Judge × Sex of Object × Type 
of Contact factorial design. There was no main 
effect for type of contact (F < 1.00) nor for sex 
of object (F < 1.00), but there was a significant 
main effect for sex of judge (F = 11.88; df = 1/36; 
p < .005). Male judges (X̄ = 44.7) had a larger 
number of adjectives for which no prediction was 
made than female judges (X̄ = 18.0). Therefore, a 
percentage score was employed to remove the effect 
of individual and/or group differences in number of 
predictions attempted. 


RESULTS 


The accuracy scores were analyzed in a 
Type III analysis of variance design (Lind- 
quist, 1953) with sex of judge and sex of 
object as the between dimensions and type 
of contact as the within dimension. 

The difference between accuracy scores of 
judges under direct contact and judges under 
indirect contact conditions was not significant 
(F < 1.00). There was no significant main 
effect for sex of judge (F < 1.00), but there 
was a significant main effect for sex of object 
(F = 5.30; df = 1/36; p < .05) with 
females (X̄ = 73.21) being predicted more 
accurately than males (X̄ = 67.53). 

The 80 judges were divided at the median 
of their distributions of accuracy scores, and 
t tests of differences between their means on 
15 personality measures scored from the ACL 
(see Table 1 below) were made. None of the 
personality differences between the high- and 
low-accuracy judges was significant at the 
5% level of confidence, and only one was 
significant at the 10% level of confidence. 
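
A median split followed by a t test on one scale could be sketched as below. The judge identifiers and scores are invented, the pooled-variance t is one common form of the test (the article does not state which form was used), and the helper names are hypothetical.

```python
from math import sqrt
from statistics import mean, median, variance


def median_split(scores):
    """Return (high, low) lists of judge ids split at the median accuracy score."""
    cut = median(scores.values())
    high = [j for j, s in scores.items() if s > cut]
    low = [j for j, s in scores.items() if s <= cut]
    return high, low


def pooled_t(a, b):
    """Independent-samples t with a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Invented data: accuracy percentages and one ACL scale score per judge.
accuracy = {"j1": 74.0, "j2": 61.0, "j3": 70.0, "j4": 66.0, "j5": 79.0, "j6": 58.0}
order_scale = {"j1": 57, "j2": 48, "j3": 55, "j4": 50, "j5": 60, "j6": 46}

high, low = median_split(accuracy)
t = pooled_t([order_scale[j] for j in high], [order_scale[j] for j in low])
print(high, low, round(t, 2))
```

The same comparison would be repeated for each of the 15 scales, which is worth keeping in mind when a single difference reaches the 10% level.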

In order to examine the possibility that a 
relationship between accuracy scores and per- 
sonality variables might be demonstrable only 
at the extremes of accuracy, an analysis of 
the differences between the mean scale scores 
of the 15 ACL scales between judges with 
the 10 highest accuracy scores and the judges 
with the 10 lowest accuracy scores was made. 
The distributions of scores on 10 of these 
scales had heterogeneous variances and were 
therefore analyzed by the nonparametric 
Mann-Whitney U test. On three scales the 
differences were significant at the 5% level 
of confidence, high-accuracy judges obtaining 
a higher Order score (X̄ = 58.70) than low-accuracy 
judges (X̄ = 47.30), a lower Change 
score (X̄ = 45.00) than low-accuracy judges 
(X̄ = 52.20), and a lower score on Affiliation 
(X̄ = 47.80) than low-accuracy judges 
(X̄ = 55.00). T scores are based on college 
population norms with X̄ = 50 and SD = 10. 


TABLE 1 
T-SCORE MEANS AND STANDARD DEVIATIONS OF ACL SCALES FOR HIGH- AND LOW- 
ACCURACY JUDGES UNDER DIRECT AND INDIRECT CONTACT CONDITIONS 

               High accuracy-    Low accuracy-     High accuracy-    Low accuracy-     Accuracy × 
               direct contact    direct contact    indirect contact  indirect contact  contact 
               (N = 20)          (N = 20)          (N = 20)          (N = 20)          interaction 
ACL scale      Mean      SD      Mean      SD      Mean      SD      Mean      SD      t 
Achievement    55.1    10.4      48.4    12.8      49.2     9.7      54.8     9.2      2.56** 
Deference      50.8     9.6      52.3     8.4      51.5     [?]      48.9     6.6      1.08 
Order          55.0    11.1      51.4    11.1      50.0     9.4      54.9     9.6      1.78* 
Exhibition     49.4    13.2      42.6    11.2      48.3     9.7      49.9     8.5      1.70* 
Autonomy       50.7    10.9      45.3     8.9      46.7     9.0      45.0    11.8       .78 
Affiliation    45.2     8.8      48.2    12.3      53.3    11.3      51.9    10.3       .90 
Intraception   49.7    10.1      50.2    11.1      49.8     7.3      53.4    10.6       .70 
Succorance     52.7    13.0      50.1    10.5      50.5     9.1      44.4     8.3       .72 
Dominance      49.2     8.5      47.5     9.8      50.0    10.9      56.0     7.0      1.84* 
Abasement      49.8     9.0      52.6     9.0      51.9     8.6      49.2     6.6      1.43 
Nurturance     48.7    10.0      52.4     9.6      52.1     9.3      52.8     7.2       .13 
Change         49.1    10.9      48.5     8.9      49.6     8.2      51.6     9.8       .57 
Endurance      53.3    12.4      50.6    11.9      50.3     9.7      52.0    10.0       .87 
Heterosexual   45.5     8.7      48.0    10.3      50.2    11.0      52.4     8.0       .07 
Aggression     52.8    11.0      47.3     8.3      50.4     9.1      51.9    10.3      1.58 

* p < .10. 
** p < .05. 
[?] = illegible in the source. 

The final type of analysis was concerned 
with possible interaction effects between type 
of contact and accuracy in relationship to the 
personality of the judge. Judges were divided 
into four groups on the basis of high or low 
accuracy and direct or indirect contact. 
Table 1 presents a summary of means, standard 
deviations, and t tests of interactions² 
between contact and accuracy on the 15 ACL 
personality scales. There was a significant 
(p < .01) interaction between accuracy and 
type of contact on the Achievement scale 
and near significant (p < .10) interactions on 
the Order, Exhibition, and Dominance scales. 
Thus, under the conditions of direct contact 
with the object, higher accuracy was associated 
with higher need for achievement, order, 
exhibition, and dominance of the judges, 
whereas under the conditions of indirect 
contact these personality attributes were 
associated with lower accuracy. 


DISCUSSION 


The finding that there is a difference in 
accuracy, as a function of the sex of the 
object, is consistent with the prior findings 
of Hathaway (1956) and Dymond (1950). 
Hathaway suggested that females were pre- 
dicted more accurately than males because 
they behave more homogeneously relative to 
the female stereotype than males behave rela- 
tive to their stereotype. The finding of no 
differences in predictive accuracy between 
male and female judges is consistent with the 
findings of Hathaway, but failed to replicate 
Dymond’s results which indicated that female 
judges were more accurate in their predictions 
than were male judges. 

The finding that the accuracy scores of 
judges under direct contact were not signifi- 
cantly different from the accuracy scores of 
judges under indirect contact conditions is 
consistent with the findings of Borke and 
Fiske (1957). As the findings concerning 
interaction of personality factors and type of 
contact suggest, placing an individual in a 
noninteractive setting does not improve his 
predictive accuracy and may tend to reduce 
it as a function of the introduction of competing 
responses associated with certain 
personality characteristics. 

² t for interaction = 

$$
t = \frac{(\text{mean}_1 - \text{mean}_2) - (\text{mean}_3 - \text{mean}_4)}{\sqrt{\dfrac{N_1 S_1^2 + N_2 S_2^2 + N_3 S_3^2 + N_4 S_4^2}{N_1 + N_2 + N_3 + N_4 - 4}}\;\sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2} + \dfrac{1}{N_3} + \dfrac{1}{N_4}}}
$$
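
The interaction t of footnote 2 can be checked against the figures in Table 1. The sketch below is one reading of that footnote, assuming the pooled variance is the sum of N times S squared over the four groups divided by the total N minus 4; the function name and argument order are hypothetical.

```python
from math import sqrt


def interaction_t(means, sds, ns):
    """t for the accuracy x contact interaction, following footnote 2.

    Group order: high-direct, low-direct, high-indirect, low-indirect.
    """
    m1, m2, m3, m4 = means
    numerator = (m1 - m2) - (m3 - m4)
    pooled_var = sum(n * s**2 for n, s in zip(ns, sds)) / (sum(ns) - 4)
    std_error = sqrt(pooled_var) * sqrt(sum(1 / n for n in ns))
    return numerator / std_error


# Achievement row of Table 1 (means and SDs as printed, N = 20 per group).
t_ach = interaction_t(
    means=(55.1, 48.4, 49.2, 54.8),
    sds=(10.4, 12.8, 9.7, 9.2),
    ns=(20, 20, 20, 20),
)
print(round(t_ach, 2))
```

With the printed means and standard deviations this gives a value of roughly 2.5, close to the tabled 2.56; the small gap presumably reflects rounding in the published figures.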

The results obtained when scores on 15 
personality scales of the ACL for high- and 
low-accuracy judges were compared indicated 
that at the extremes of accuracy, there 
were differences on personality dimensions— 
affiliation, change, and order. 

The need for affiliation has been defined³ 
as the need to seek and sustain numerous 
personal friendships. It is somewhat surpris- 
ing, therefore, to find that individuals who 
were less accurate in predictions showed a 
higher need for affiliation than individuals 
who were more accurate. Frequency of social 
contact with others does not necessarily 
imply more accurate information about their 
behavior. 

The need for change is defined as the need 
to seek novelty of experience and avoid rou- 
tine. One possible explanation for the nega- 
tive relationship between predictive accuracy 
and change is that certain of the correlates 
of the need for change also affect predictive 
accuracy, for example, a reduction in concen- 
tration or inability to attend to one situation 
for an extended time. 

The need for order is defined as the need 
to place emphasis on organization and plan- 
ning of one’s activities. The individual with 
a higher need for order may have improved 
his accuracy of prediction by greater organi- 
zation of his approach to the interview and 
of the information he received as a function 
of that interview. 

When judges under the direct contact con- 
dition were considered separately from judges 
under the indirect contact condition, a signifi- 
cant (p < .05) interaction between accuracy 
and condition of contact was observed on 
the Achievement scale. 

The achievement need is defined as the 
need to be outstanding in pursuits of socially 
recognized significance. One interpretation 
suggested by this interaction finding is that 
when an individual with a higher need for 
achievement is placed in an interpersonal 
situation and certain of the overt responses 
associated with this need are prevented from 
occurring (as in the indirect contact condition), 
he experiences some degree of frustration. 
Certain additional responses are then 
evoked which interfere with the response of 
attending to the object in order to obtain 
information relevant to making predictions 
about his behavior. These competing responses 
may be one or more of the types 
mentioned by Brown and Farber (1951) and 
Child and Waterhouse (1953) which include 
strong habits of responding to frustration 
and aggressive, disorganized responses. 

³ This and the succeeding definitions of needs are 
taken from Gough and Heilbrun (1965). 

On the other hand, individuals who had a 
higher need for achievement obtained higher 
accuracy scores when placed in direct-contact 
condition. This is consistent with the expecta- 
tion that these individuals are more com- 
petent in handling social situations in which 
they can structure or manipulate the situation 
in accordance with their needs. 


A FACTOR ANALYSIS OF THE BECK INVENTORY 
OF DEPRESSION * 


T. E. WECKOWICZ, W. MUIR, AND A. J. CROPLEY 
University of Alberta, Canada 


A factor analysis of the Beck Inventory of Depression, based on the responses 
of 254 significantly depressed hospital patients, has yielded 3 interpretable 
factors. The largest factor was that of affective depression referred to here 
as a factor of “guilty depression.” The 2 remaining factors were interpreted 
as “retarded depression” and “somatic disturbance.” These 3 factors showed 
some correspondence to the factors found by other investigators. The suggestion 
is made that the etiology of various depressed states may be associated 
with different levels of mental functioning. The need for additional factor 
analytic studies, which include behavioral and physiological measures, is 
noted. 


The concept of depression as used by 
psychiatrists is an ambiguous one. It may 
describe a patient’s mood or refer to a 
nosological entity (Lehmann, 1959). Mendel- 
son (1959) has suggested that a heterogene- 
ity of clinical samples has resulted in differing 
psychopathological interpretations of depres- 
sion. Since Mendelson (1959) has reviewed 
the different classificatory systems of depres- 
sion evolved by clinicians, these will not be 
reviewed in the present paper. 

In view of the subjectivity of the descrip- 
tion and classification of depression and other 
illnesses prevalent in clinical psychiatry, at- 
tempts have recently been made to evolve 
more objective methods in the form of 
symptom rating scales and inventories. The 
amount of literature on the subject is con- 
siderable and has been reviewed by Lorr 
(1954) and Cutler and Kurland (1961). In 
order to obtain objective classifications of 
psychiatric symptoms, several factor ana- 
lytic studies of patient scores on rating scales 
and symptom check lists have been carried 
out. In several of the studies, heterogeneous 
groups of psychiatric patients have been used. 
In a sample of chronic psychiatric patients 
Degan (1952) reported two correlated first-order 
factors, that of anxious depression and 
that of neurasthenia. He also obtained a 
second-order factor of retarded depression 
which combined these two factors. In several 
studies using orthogonal or oblique rotations 
on samples of patients which appeared to be 
predominantly schizophrenic, a bipolar factor 
of retarded depression-mania and a factor 
of agitated depression have been found (Lorr, 
1957; Lorr, Jenkins, & O’Connor, 1955; Lorr, 
O’Connor, & Stafford, 1955; Wittenborn, 
1951; Wittenborn & Holzberg, 1951). Wit- 
tenborn and Bailey (1952) carried out a 
Q-factor analysis of a group of patients diag- 
nosed as having involutional psychosis and 
found a factor indicating a syndrome of 
agitated depression. 

There have been several factor analytic 
studies of scales of depression reported in 
the literature. Comrey (1957) obtained 17 
orthogonal factors from the depression scale 
of the MMPI, only 2 of which could be 
considered as related to depression. Hamilton 
subjected the scores of depressed patients on 
a rating scale, designed by himself, to factor 
analysis (Hamilton, 1960; Hamilton & White, 
1959). He identified four factors which he 
called: “retarded depression,” “agitated de- 
pression,” “anxiety reaction” and “psycho- 
pathic depression.” The first two factors cor- 
responded to two conventional diagnostic 
categories, the third was related to the outcome 
of treatment, and the fourth was indicative 
of some degree of character disorder. An 
important factor analytic study was carried 
out by Grinker, Miller, Sabshin, Nunn, and 
Nunnally (1961). These authors factor ana- 
lyzed the intercorrelations of patient scores 
on a “feelings and concerns” check list and 
also on a “current behavior” check list. They 
obtained five factors from the “feelings and 
concerns” check list. The first factor was 
considered to be a general factor of depres- 
sion while the remaining four factors could 
be interpreted as representing different ego- 
defense mechanisms dealing with feelings of 
depression. Ten orthogonal factors were extracted 
from the “current behavior” check 
list. From the factor scores of the subjects on 
all 15 factors of both lists, four “factor pat- 
terns” were derived which characterized four 
types of depression described by the authors 
as “retarded empty,” “anxious,” “hypo- 
chondriacal,” and “angry” depression. Overall 
(1962) factor analyzed a 31-item manifest 
depressive rating scale, developed by himself 
and administered to a group of depressed pa- 
tients. He reported seven orthogonal factors 
which he defined as: “depression in mood,” 
“guilt,” “psychomotor retardation,” “anx- 
iety,” “subjective experience of impairment 
in functioning,” “abnormal preoccupation 
with physical health,” and “physical response 
to stress.” These factors were the same as 
those obtained by the author from a group 
of schizophrenic patients. A factor analytic 
study using ratings by psychiatrists on a 
group of selected depressed patients was car- 
ried out by Friedman and his collaborators 
(Friedman, Cowit, Cohen, & Granick, 1963). 
Four orthogonal factors were extracted which 
may be characterized as factors of mood dis- 
turbance, retardation, somatic disturbance, 
and demanding hypochondriasis. 

Kiloh and Garside (1963), using a check 
list of depressive symptoms, found two fac- 
tors: a general factor of depression and a 
bipolar factor which differentiated between 
endogenous and exogenous (neurotic) depres- 
sion and was associated with the outcome of 
electroconvulsive therapy. In a subsequent 
study, Carney, Roth, and Garside (1965) 
extracted, in addition to the above-mentioned 
two factors, a third, identified as a “paranoid 
psychotic” factor. Finally, Rosenthal and 
Klerman (1966) obtained, in their factor 
analysis of depressive symptoms, a general 
factor of endogenous depression. There was 
a high correlation between the factor scores 
on this general factor and the factor scores on 
Hamilton’s first factor, and also those on the 
bipolar factor of Kiloh and Garside. 

All these factor analytic studies, with the 
exception of Comrey’s, involved ratings by 
psychiatrists or psychiatric nurses. A self- 
report inventory, measuring intensity of de- 
pression and having satisfactory reliability 
and validity, has been reported by Beck, 
Ward, Mendelson, Mock, and Erbaugh 
(1961). This inventory consists of 21 sub- 
scales which measure different groups of de- 
pressive symptoms. For each subscale, the 
patient selects a statement describing his 
condition in a multiple-choice situation. The 
21 subscales are listed in Table 1. 

Cropley and Weckowicz (1966) have re- 
ported a preliminary factor analytic study, 
using the maximum likelihood method (Law- 
ley & Maxwell, 1963), of this inventory on 
a relatively small number of subjects. The 
purpose of the present study is to examine 
the factorial structure of Beck’s scale using 
a much larger sample and to compare the 
factors obtained with those of other studies. 


METHOD 
Subjects 


The Beck inventory was administered to all newly 
admitted patients in the psychiatric unit of a large 
general hospital during a period of approximately 
16 months. The method of administration and 
scoring of the inventory followed that described by 
Beck and his collaborators (Beck et al., 1961). Of 
the total number of 391 patients who were tested, 
254 scored 17 or more. The score of 17 was reported 
by Beck as the cutoff point indicating a clinically 
significant degree of depression. Only the 254 patients 
who reached this criterion were included in the 
factor analysis. This procedure was adopted, because 
the inclusion of nondepressed patients could have 
resulted in obtaining a different factorial structure 
from that obtained from a sample including only 
depressed patients. In particular, there was a pos- 
sibility of obtaining a large general factor of depres- 
sion, differentiating only depressed from nondepressed 
patients and indicating the depth of depression rather 
than the description of naturally occurring clusters 
of depressive symptoms, as was the intention of this 
study. There were 180 females and 74 males in the 
sample with a mean age of 39.3 years (SD 13.7, 
range 15-77 yrs.). 

2 A factor analysis based on the total sample of 
391 patients to whom the Beck inventory was 
administered resulted in a general factor accounting 
for 77.8% of the common variance and a doublet 
accounting for 13.3% of the common variance. 


Factor Analysis 


An initial principal-axis factor analysis using the 
Householder method (1938), with unities in the 
diagonal of the correlation matrix, was carried out. 
The purpose of this procedure was to establish the 
number of significant factors according to Kaiser’s 
(1960) criterion of retaining those factors with 
eigenvalues of one or greater. This resulted in the 
retention of eight factors upon which the initial 
communality estimates were based. In view of the 
relatively small number of variables being factored, 
in order to minimize the bias associated with having 
unities in the diagonal of the correlation matrix, 
the obtained communality estimates were inserted 
into the diagonal and the matrix was refactored. 
This procedure was repeated until the communalities 
became stable. The stabilization, applying a 
convergence criterion of ±.02, occurred on the fourth 
iteration. There was a relatively sharp decrease in 
the size of the eigenvalues between factors three and 
four in the final principal-axis factor matrix. This 
suggested that only three interpretable factors 
might be expected. The obtained principal-axis 
factor matrix of eight factors was rotated using 
the varimax criterion (Kaiser, 1958). Of the rotated 
eight factors, the seventh factor was a singleton 
and the eighth factor a doublet, while in general a 
poor simple structure was obtained (Thurstone, 
1947). Also, the number of factors was greater than 
one third of the total number of variables, a 
commonly accepted criterion of adequacy of 
factorization. In view of these considerations, the 
first six principal-axis factors were rerotated to the 
varimax criterion. The rotated factor matrix based on 
six factors of the principal-axis solution is shown in 
Table 1. In addition to the orthogonal solution an 
oblique solution was derived using the promax 
rotation method (Hendrickson & White, 1964).4 
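
The factoring steps described above can be sketched in modern terms. What follows is only an illustration under stated assumptions, not the program used in the study: it assumes NumPy, the function names kaiser_count and iterated_principal_axis are hypothetical, and the four-variable correlation matrix is invented for the example. The sketch counts eigenvalues of one or greater with unities in the diagonal, then refactors with communality estimates in the diagonal until no estimate changes by .02 or more.

```python
import numpy as np


def kaiser_count(R):
    """Count eigenvalues >= 1 of a correlation matrix with unities in the diagonal."""
    return int(np.sum(np.linalg.eigvalsh(np.asarray(R, dtype=float)) >= 1.0))


def iterated_principal_axis(R, n_factors, tol=0.02, max_iter=50):
    """Principal-axis factoring with iterated communality estimates.

    Returns (loadings, communalities, iterations_used).
    """
    R = np.asarray(R, dtype=float)
    h2 = np.ones(R.shape[0])                # first pass: unities in the diagonal
    for iteration in range(1, max_iter + 1):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)        # insert current communality estimates
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        roots = np.clip(eigvals[top], 0.0, None)
        loadings = eigvecs[:, top] * np.sqrt(roots)
        new_h2 = np.sum(loadings ** 2, axis=1)
        if np.max(np.abs(new_h2 - h2)) < tol:
            return loadings, new_h2, iteration
        h2 = new_h2
    return loadings, h2, max_iter


# Hypothetical data: four variables that each load .70 on a single factor.
R_demo = np.full((4, 4), 0.49)
np.fill_diagonal(R_demo, 1.0)

n_retained = kaiser_count(R_demo)
loadings_demo, h2_demo, n_iter = iterated_principal_axis(R_demo, n_retained)
```

In this invented example the communalities settle near .49, the value implied by loadings of .70. The one-third rule mentioned in the text could be checked in the same spirit by comparing the number of retained factors with the number of variables divided by three (8 against 21 / 3 = 7 in the present data).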


3 An independent maximum likelihood factor 
analysis (Lawley & Maxwell, 1963), which was 
carried out on the same data, revealed only six 
significant factors, thereby supporting the decision 
to consider six factors as descriptive of the common 
factor space. 

4 To reduce printing costs, the following tables: 
1. the matrix of intercorrelations among the 21 
subscales, with communality estimates in the diagonal; 
2. the unrotated principal-axis matrix based on the 
correlation matrix having stabilized communality 
estimates in the diagonal; 3. the orthogonal 
transformation matrix; 4. the primary factor pattern 
coefficient matrix of the oblique, promax solution; and 
5. the matrix of correlations between primary 
factors have been deposited with the American 

TABLE 1 
ROTATED FACTOR MATRIX (VARIMAX) 


                                I     II    III     IV      V     VI     h²

 1. Mood                      372    414    090    045   -013    217    367
 2. Pessimism                 327    229   -041    207    135    240    280
 3. Sense of failure          493   -030    023    086    191    218    336
 4. Lack of satisfaction      247    464    171    142   -079    173    362
 5. Guilty feeling            626    115   -004    018   -059   -038    410
 6. Sense of punishment       562    033   -148    024   -021   -076    346
 7. Self-hate                 427    246   -015   -082    009    117    263
 8. Self-accusation           529   -117    021   -023    081   -078    307
 9. Self-punitive wishes      464    000    086    238    227    057    334
10. Crying spells             045    197   -002   -054    343    062    165
11. Irritability             -007    151    067   -033    151    546    350
12. Social withdrawal         077    149    070    723    101   -013    567
13. Indecisiveness            330    357    077    222    225   -008    342
14. Body image                065   -012   -038    107    469    039    238
15. Work inhibition          -064    642   -011    178    040    033    451
16. Sleep disturbance         014    111    392   -294    251   -345    434
17. Fatigability             -080    553    056   -042    161   -107    355
18. Loss of appetite          001    100    675    043    046   -018    470
19. Weight loss              -065   -035    687    051   -083    099    496
20. Somatic preoccupation     055    359   -024   -020    062    092    145
21. Loss of libido            074    264    193    068    289    081    207

Sum of squares              2.079  1.681  1.208  0.855  0.731  0.673  7.227
% Total variance              9.9    8.0    5.8    4.1    3.5    3.2   34.4
% Common variance            28.8   23.3   16.7   11.8   10.1    9.3  100.0
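
The summary rows of Table 1 follow from simple arithmetic on the loadings, which the short sketch below retraces. It is an illustration only: it assumes that the loadings are printed with decimal points omitted (so that 372 stands for .372), an assumption that the Factor I check supports, and all variable names are hypothetical. A factor's sum of squares is the sum of its squared loadings; the percentage of total variance divides that sum by the 21 subscales, and the percentage of common variance divides it by the total of the six sums of squares.

```python
# Factor I loadings from Table 1 (decimal points omitted in the table).
factor_1 = [372, 327, 493, 247, 626, 562, 427, 529, 464, 45, -7,
            77, 330, 65, -64, 14, -80, 1, -65, 55, 74]
ss_factor_1 = sum((x / 1000) ** 2 for x in factor_1)

# "Sum of squares" row of Table 1 for the six rotated factors.
sums_of_squares = [2.079, 1.681, 1.208, 0.855, 0.731, 0.673]
n_variables = 21   # each standardized subscale contributes unit variance

common_variance = sum(sums_of_squares)
pct_total = [round(100 * ss / n_variables, 1) for ss in sums_of_squares]
pct_common = [round(100 * ss / common_variance, 1) for ss in sums_of_squares]
pct_total_all = round(100 * common_variance / n_variables, 1)
```

Under these assumptions the computed values reproduce the three summary rows of the table, including the 34.4% of total variance accounted for by the six factors together.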


T. E. WECKOWICZ, W. MUIR, AND A. J. CROPLEY 


DISCUSSION 


The varimax rotation of six factors ap- 
proximated simple structure relatively well. 
These factors accounted for 34.4% of the 
total variance, and 88.9% of the common 
variance based on eight factors. In addition, 
a promax oblique rotation was carried out, 
giving essentially the same factorial structure 
with low intercorrelations between the factors. 
Thus, orthogonality of the factorial structure 
was accepted. 
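
An orthogonal rotation of the kind discussed here can be illustrated with a short sketch. It is not the routine used in the study: it assumes NumPy, implements the raw varimax criterion without Kaiser's row normalization by means of a commonly used iteration based on the singular value decomposition, and uses an invented two-factor loading matrix. The function name varimax and the example data are hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally toward the raw varimax criterion.

    Returns (rotated_loadings, transformation_matrix).
    """
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    T = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient of the varimax criterion with respect to the rotation.
        G = A.T @ (L ** 3 - L * (np.sum(L ** 2, axis=0) / p))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                           # orthogonal factor of G
        current = np.sum(s)
        if previous != 0.0 and current < previous * (1.0 + tol):
            break
        previous = current
    return A @ T, T


# Hypothetical example: a simple structure turned through 30 degrees.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.5]])
angle = np.deg2rad(30.0)
turn = np.array([[np.cos(angle), np.sin(angle)],
                 [-np.sin(angle), np.cos(angle)]])
unrotated = simple @ turn
rotated, T = varimax(unrotated)
```

Because the transformation is orthogonal, each variable's communality (the row sum of squared loadings) is the same before and after rotation; only the distribution of variance among the factors changes.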

Using .300 as the criterion of significance 
for a factor loading, it can be seen from the 
inspection of Table 1 that only the first three 
factors have a sufficient number of significant 
loadings to permit interpretation.5 The others 
are either singletons or doublets. 

The first factor loads significantly on 
Guilt Feelings, Sense of Punishment, Self- 
Accusation, Sense of Failure, Self-Punitive 
Wishes, Self-Hate, Depressed Mood, Inde- 
cisiveness, and Pessimism. It may be inter- 
preted as a factor of “guilty depression.” In 
this combination of symptoms the inner 
experience of sadness and guilt is prominent, 
and the higher mental processes, relating to 
the self-concept and the meaning of personal 
existence, seem to be involved. The second 
factor is identified by high loadings on Work 
Inhibition, Fatigue, Lack of Satisfaction, De- 
pressed Mood, Somatic Preoccupation, and 
Indecisiveness. The loading on Loss of Libido, 
while less than the stated criterion, is 
relatively high. An important aspect of this factor 
is the zero loading on Self-Punitive Wishes, 
which would indicate that patients scoring 
high on this factor could be considered as 
presenting a low suicidal risk. This factor 
could be called a factor of “retardation.”6 It 
indicates an involvement of lower mental 
processes concerned with the functioning of 
the body and its vital energy. 


Documentation Institute. Order Document No. 9027 
from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress, 
Washington, D. C. 20540. Remit in advance $1.25 for 
microfilm or $1.25 for photocopies and make check 
payable to: Chief, Photoduplication Service, Library 
of Congress. 

5 Holzinger and Harman (1941, pp. 122-36) and 
Harman (1960, p. 439) have proposed a method of 
estimating the standard error of a factor loading 
based on the average correlation of the R matrix and 
the number of subjects. Further, they have 
proposed a test of significance for a factor loading 
based on this estimation of its standard error. The 
application of this test to the data of the present 
study resulted in a factor loading equal to or greater 
than 0.25 being significant at p < .01. However, 
Harman (1960) suggests a more stringent level of 
significance for factor loadings. Therefore the .300 
(p < .001) criterion seems to be reasonable. 

The third factor is defined by high loadings 
on Weight Loss, Loss of Appetite, and Sleep 
Disturbance, and can be described as a factor 
of “somatic disturbance.” It seems to be even 
more “physiological” in its connotation than 
the second factor, and it also loads low on 
Self-Punitive Wishes. 

As mentioned previously, the remaining 
three factors are essentially singletons or 
doublets. However, in the fifth factor, the 
loading on Loss of Libido approaches the 
stated criterion of significance. This factor 
loads relatively high on Body Image, Crying 
Spells, and Loss of Libido. Although an 
interpretation of this factor is doubtful, 
it could nevertheless be called a factor of 
“tearful depression” with some suggestion of 
hysterical features. 

Thus, Beck’s inventory has yielded three 
clearly defined factors with a possibility of 
a fourth factor. The first three rotated factors 
account for only 23.7% of the total variance 
and 68.8% of the common variance. This, as 
well as relatively low factor loadings, reflects 
the fact that intercorrelations among the 
items are relatively small and therefore the 
internal consistency of the test is low. 
This impression is confirmed by a Kuder- 
Richardson 20 reliability coefficient of .53 for 
the reported sample. Beck et al. (1961) re- 
ported a corrected split-half reliability of .93. 
However, the present sample was limited only 
to subjects scoring 17 or more, thereby re- 
stricting the range and lowering the internal 
consistency of the scores.7 
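
The effect of range restriction on internal consistency can be illustrated with a small sketch. Several assumptions are made here. The Kuder-Richardson 20 formula is defined for dichotomous items; the sketch uses the general variance form of the coefficient (coefficient alpha), which reduces to KR-20 for dichotomous items, and it is not known whether the reported values were computed in exactly this way. The function name, the two-item data set, and the cutoff of 4 are all hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def coefficient_alpha(scores):
    """Internal consistency from a list of respondents' item-score rows."""
    k = len(scores[0])

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    item_variances = sum(variance([row[j] for row in scores]) for j in range(k))
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variances / total_variance)


# Hypothetical two-item inventory answered by eight respondents.
scores = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
alpha_full = coefficient_alpha(scores)

# Keep only respondents at or above an arbitrary cutoff on the total score.
restricted = [row for row in scores if sum(row) >= 4]
alpha_restricted = coefficient_alpha(restricted)
```

In this invented example the coefficient is high in the full group and drops to zero among the high scorers, because within that restricted range the two items no longer covary. The contrast between the values reported for the 254 selected patients and for the total sample of 391 is in the same direction.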




6 Giving names to factors tends to be somewhat 
subjective and arbitrary. In the case of the second 
factor, perhaps a better name, not implying the 
conventional “retarded-agitated” dimension of 
depression, would be “a loss of vital energy,” having 
a more purely somatic connotation. 

7 The Kuder-Richardson 20 reliability coefficient 
based on the total sample of 391 patients was found 
to be .78. 


BECK INVENTORY OF DEPRESSION 


As further comment on the factorial inter- 
pretation it can be added that, while the 
cluster of symptoms defining the first factor 
is concerned, as it were, with the “spiritual” 
aspect of self, the clusters of the second and 
third factors are concerned with the “animal” 
or physical aspects of self. Thus, there is a 
suggestion of a hierarchical organization of 
psychic processes, probably related to dif- 
ferent levels of ontogeny, underlying the dif- 
ferentiation of the dimensions of depression. 
This suggestion is in agreement with the 
theory of epigenesis of depressive illness 
proposed by Yonge (1966). 

It is of some interest to compare the 
factors obtained in the present study with 
those obtained in other studies and try to 
establish the invariance of those factors. Of 
course, mathematically objective comparisons 
would require using some common tests and 
rotating to maximize similarities, but, since 
depression scales sample the same universe of 
symptoms, they involve considerable overlap. 
Consequently, it is legitimate to compare the 
descriptions of factors obtained in various 
studies. 

As far as the reproducibility of Hamilton’s 
factors is concerned, there is a similarity be- 
tween Factor I called “guilty depression” in 
the present study and Hamilton’s first rotated 
factor. Both load highly on depressed mood, 
guilt, and suicidal tendencies, and load low on 
somatic complaints. There is also some 
similarity between his Factor II and the second 
factor obtained in the present study. Both 
factors load highly on somatic complaints and 
load low on guilt and suicidal tendencies. 

A similarity can also be reported between 
the first factor extracted by Grinker et al. 
(1961) from their “feelings and concerns” 
check list, characterized by feelings of hope- 
lessness, failure, sadness, unworthiness, guilt, 
and internal suffering, and Factor I, “guilty 
depression,” obtained in the present study. 
Comparing the present study with that 
of Overall (1962), the traits loading high 
on the “guilty depression” factor of the 
present study are split between his first, 
“depression in mood,” and second, “guilt,” 
factors. There is some similarity between his 
fifth factor, “subjective experience of 
impairment in functioning,” and the “retardation” 
factor of the present study. In addition there 
is a similarity between his seventh, “physical 
response to stress,” factor and the factor of 
“somatic disturbance” reported here. 


8 Although Hamilton rotated his factors to simple 
structure, for an unexplained reason he used the 
unrotated factors for interpretation. He also seems 
to have failed to distinguish between factors, 
which are dimensions of individual variation, and 
nosological entities, which are classes of patients. 

With regard to Friedman’s study, there is 
a striking similarity between his first factor 
of “classical mood or affective depression” 
and the first factor of “guilty depression” 
obtained in the present study. Also, his sec- 
ond factor, characterized as a “retarded, 
withdrawn, apathetic depression,” bears some 
similarity to the second, “retardation,” factor 
of the present study. Friedman’s third factor, 
described as a “primarily ‘biological’ reaction, 
with loss of appetite, sleep disturbance, 
constipation, work inhibition, and loss of 
satisfaction,” has some resemblance to the 
third, “somatic disturbance,” factor found in 
the present study. However, the symptoms of 
“work inhibition” and “loss of satisfaction” 
loaded oppositely on the second and third 
factors of the two studies. Some similarity 
can also be discerned between his “oral- 
demanding” depression and the fifth factor 
of “tearful depression” tentatively reported in 
this paper. There is also some similarity be- 
tween the second, bipolar factor of Kiloh and 
Garside and the general factor of “endoge- 
nous depression” of Rosenthal and Klerman, 
on the one hand, and the factor of “guilty 
depression” obtained by the present authors, 
on the other. In fact, Hamilton’s, Grinker’s, 
Friedman’s, Kiloh’s and the present authors’ 
largest factors are very similar. This largest 
factor may be a factor of “affective depres- 
sion” rather than the general factor, as 
alluded to by other authors. 

As a final comment it may be pointed out 
that the factors obtained in the present study 
seem to separate symptoms involving higher 
psychological functions from those of lower. 
This finding requires further confirmation 
before any conclusions regarding the dimen- 
sions of depressive illness can be made, for 
it may well be an artifact of the restricted 
sample of items in the Beck inventory. The 
present factor analysis was based on the 
complaints of patients and their subjective 
experience. It may not necessarily represent the 
dimensionality of the total clinical picture 
which should include also a description of the 
behavior of the patients.9 Repeated factor 
analytic studies, both cross-sectional and 
longitudinal, which include symptom ratings, 
behavior ratings, and physiological measures, 
are important for an understanding of the 
nosology of depressive illness. 


9 The description of the second factor as that of 
“retardation” was an inference based on the 
subjective complaints of the patients. An objective 
observation of the patients’ behavior could have 
resulted in the description of the first factor as that 
of “retarded” depression on which patients diagnosed 
as suffering from “endogenous depression” would 
score relatively high. The second factor would then 
become that of “neuroticism” on which patients 
suffering from “exogenous depression” would score 
relatively high, thus more closely conforming to the 
accepted clinical usage. 
Bright teenage boys enrolled in special educational programs for academic 
underachievers and for high achievers were administered the Parental Attitude 
Research Instrument (PARI) with instructions to complete the inventory the 
way their mothers would respond. The PARI was also administered to the 
mothers. The 2 groups of boys did not differ in perceptions of maternal 
hostility, but the underachievers perceived their mothers as significantly higher 
on maternal control. There were no significant differences between maternal 
attitudes avowed by the 2 groups of mothers, although there was a trend 
suggestive of more control avowed by mothers of the high-achieving boys. 
Much greater differences between mothers’ avowal and sons’ perceptions were 
found in the underachieving group, with the most pronounced discrepancies 
being evidenced on measures of maternal control. Whereas mothers’ and sons’ 
scores correlated significantly for the control factor in the group of high 
achievers, there were no significant associations between attitudes ascribed 
to their mothers and actual attitudes avowed by mothers of the underachievers. 


Recent years have witnessed an increasing 
research interest in relations between parental 
attitudes and child behavior (e.g., Miller & 
Swanson, 1958; Sears, Maccoby, & Levin, 
1957; Whiting & Child, 1953). In the wake 
of these comprehensive reports of program- 
matic researches conducted in varied social 
and cultural settings have come numerous 
empirical studies devoted to more circum- 
scribed facets of parent-child relations. Many 
of these studies have utilized the Parental 
Attitude Research Instrument (PARI), de- 
veloped originally by Schaefer and Bell 
(1958), which has been found (Schaefer, 
1961; Zuckerman, Ribback, Monashkin, & 
Norton, 1958) to provide measures of two 
main factors—one labeled hostility-rejection 
and the other labeled authoritarian-control. 

1 This research was made possible by grants to 
the Department of Education at Brown University 
from the Carnegie Corporation in support of the 
academic potential project and from the National 
Science Foundation in support of the summer science 
program. 

Directly relevant to the present research 
is a study by Drews and Teahan (1957) who 
focused on relations between maternal 
attitudes and academic achievement in junior 
high school students. A group of gifted 
students (IQ of 130 or higher) was dichotomized 
into high achievers and low achievers on 
the basis of school grades, and another group 
of students with average intelligence was 
similarly classified. Using a questionnaire 
similar to the PARI, these investigators found 
that the mothers of the high achievers were 
more authoritarian and more restrictive in 
their child-rearing attitudes than were the 
mothers of the low achievers. Moreover, the 
mothers of the high-achieving gifted children 
evidenced more punitive attitudes with respect 
to child rearing. The authors point out that 
their results fit with Gough’s (1953) earlier 
findings that academically successful high 
school students tended to be conforming, or- 
derly, docile, and conventional. The impres- 
sion to be gained from these researches is that 
academic high achievers are likely to come 
from a family situation in which the adults 
feel they know what is best for the child and 
are willing and able to see that the child 
conforms to standards set by adult authority 
figures. 

Heilbrun has conducted research utilizing 
a methodology very similar to that employed 
in our investigation. In one study (Heilbrun, 
1960), he matched a group of schizophrenic 
girls and a group of normal girls for age 
and educational level and compared their per- 
ceptions of their mothers’ child-rearing atti- 
tudes. That is, he had the schizophrenic 
daughters and normal daughters predict the 
responses their mothers would give on the 
PARI. The mothers of the girls in the two 
groups were also administered the PARI and 
it was found that they did not differ in their 
avowal of child-rearing attitudes. The finding 
of no differences between PARI responses 
obtained from mothers of schizophrenics and 
mothers of normal children is consistent with 
findings reported previously by Zuckerman, 
Oltean, and Monashkin (1958). 

Although the two groups of mothers in 
Heilbrun’s study did not differ, the schizo- 
phrenic daughters, to a markedly greater 
extent than the normal daughters, perceived 
their mothers as possessing pathogenic (so- 
cially undesirable) child-rearing attitudes. 
When the data were analyzed in terms of 
the primary factors measured by the PARI, 
Heilbrun found that the schizophrenics’ per- 
ceptions of their mothers’ attitudes relating 
to the authoritarian-control dimension were 
most markedly abnormal, while their percep- 
tions of maternal attitudes indicative of the 
hostility-rejection dimension were not sub- 
stantially different from the perceptions of 
the normal daughters. 

In a more recent study, Heilbrun and Mc- 
Kinley (1962) employed college girls in the 
attempt to see what effect incipient psycho- 
pathology (rather than blatant abnormality) 
might have on daughters’ perceptions of 
mothers’ child-rearing attitudes. They di- 
chotomized the daughters into a group with 
abnormal tendencies and a group without 
indications of abnormality on the basis of 
personality profiles derived from the Min- 
nesota Multiphasic Personality Inventory 
(MMPI) and then compared their responses 
to the PARI which was administered under 
the instructions of completing the inventory 
the way they believed their mothers would 
respond. In keeping with the earlier study of 
institutionalized schizophrenics, Heilbrun and 
McKinley found that the college girls with 
tendencies toward psychopathology perceived 
their mothers as being more controlling and 
more hostile than did the girls who evidenced 
no signs of psychopathology. 

Our primary research interest was neither 
in daughters nor in psychopathology, but was 
focused on academic attainment in teenage 
boys. We utilized a procedure similar to that 
described by Heilbrun, but for a purpose 
more akin to that of Drews and Teahan in 
their study of maternal attitudes and aca- 
demic achievement. More specifically, bright 
underachieving and high-achieving secondary 
school boys were asked to complete the PARI 
the way they believed their mothers would 
respond to the items, and an identical form 
of the instrument was independently admin- 
istered to the mothers. It was hoped that 
comparison of the mothers’ actual avowal of 
child-rearing attitudes and their sons’ percep- 
tions of these maternal attitudes might lead 
to increased understanding of mother-child 
relations in families rearing sons who are 
outstandingly successful students and in those 
less fortunate families in which highly intel- 
ligent sons are in danger of not making the 
grade in an ordinary school setting. 


METHOD 
Setting 


During the summer of 1963, Brown University 
conducted two programs of special education for 
secondary school students. One was a science pro- 
gram for bright, academically successful boys and 
girls who had demonstrated outstanding potential 
for careers as scientists. In order to qualify for this 
program, these students had to be in the top 10% 
of their high school class, with grades indicative of 
distinction in several science courses, and high 
recommendations from their school administrators. 
The other program was an academic potential 
project for bright junior high school boys who 
were failing in their regular school situation. In 
order to qualify for this program, the boys had 
to possess superior intellectual ability, failing or 
barely passing academic records, and letters from 
school authorities indicating that if the boy’s 
motivation were different, he would do superior 
academic work. 

Both groups of students were in residence at the 
University for a 6-week period, but the two pro- 
grams were completely independent. The summer 
science program was designed to provide the 
high-achieving youngsters with accelerated and enriched 
education in the sciences in the hope of maximizing 
their further academic and professional attainments. 
The underachievers attended daily classes in 
conventional school subjects and participated in an 
organized program of academic and extracurricular 
activities designed to alter the motivation of 
these boys in the hope of salvaging their talents 
and remedying their academic difficulties to enable 
them to go on to higher education. 


Underachieving Boys and Their Mothers 


One group consisted of 55 boys who were enrolled 
in the academic potential project. The average age 
in this group was 14.5 years and the average IQ 
was 128. All of them were experiencing pronounced 
academic difficulty and were judged by teachers and 
guidance personnel to be functioning far beneath 
their potential. The mothers of 48 of these boys also 
served as subjects in this study. Although we do 
not have statistical data in this regard, most of these 
women were married and the sons came from intact 
families, with the majority being representative of 
middle-class society. A few of the boys were from 
broken homes and/or lower-class backgrounds, and 
a few were from upper-class families, but in almost 
all cases it can be said that the parents were con- 
cerned about their child’s situation and hoped that 
he would improve academically. 


High-Achieving Boys and Their Mothers 


The other group consisted of 31 boys who were 
enrolled in the summer science program. The average 
age in this group was 16.5 years and the average 
IQ was 133. All of these boys were using their 
intellectual ability to full advantage as evidenced 
by their outstanding academic records and glowing 
letters of recommendation from teachers and 
principals. The mothers of 29 of these boys also 
participated in this investigation. Again, we have no 
systematic data to report, but the general 
impression was that these boys were largely from intact 
middle-class families, with a few from the lower and 
upper social classes, and that most of the parents 
encouraged and appreciated their child’s 
academic attainments. 


Parental Attitude Research Instrument 
(PARI) 


Description. The PARI is designed to reveal 
parental attitudes toward family life and child rear- 
ing. The modified short form of the inventory em- 
ployed in the present study consists of 30 items to 
which the subject indicates agreement along a 
4-point scale running from “strongly disagree” to 
“strongly agree.” The statements cluster into six 
areas (or dimensions), each represented by five 
items. These areas, with an example of the type 
of item comprising each of them, are as follows: 
(a) marital conflict: “People who think they can 
get along in marriage without arguments just don’t 
know the facts,” (b) irritability: “Raising children 
is a nerve-wracking job,” (c) rejection of 
homemaking: “A young mother feels ‘held down’ because 
there are lots of things she wants to do while she 
is young,” (d) ascendancy: “A married woman 
knows that she will have to take the lead in family 
matters,” (e) intrusiveness: “A good mother wants 
to have a share in all her child’s experiences,” and 
(f) deification: “Parents deserve the highest esteem 
and regard of their children.” 

Scoring. Responses to each item receive a score 
from 1 to 4, depending upon the degree of agree- 
ment avowed by the respondent, making a minimum 
possible score of 5 and a maximum possible score of 
20 for each of the 6 areas. Factor analyses have 
shown that scores derived from the measures of 
marital conflict, irritability, and rejection of home- 
making, form a factor termed “hostility,” while 
scores indicative of ascendancy, intrusiveness, and 
deification, form a factor termed “control.” The 
minimum possible score for each of these two major 
factors is 15 and the maximum is 60. Scores from 
both factors can be combined to provide a “total 
score” which indicates the extent of hostility plus 
control evidenced by the subject’s responses to the 
PARI. 
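
The scoring rules just described can be expressed as a short sketch. It is an illustration under stated assumptions rather than the scoring program used in the study: the assignment of the 30 items to areas is not given in the text, so the sketch assumes that responses arrive already grouped by area, and the names AREAS, HOSTILITY, CONTROL, and score_pari are hypothetical.

```python
AREAS = [
    "marital conflict", "irritability", "rejection of homemaking",
    "ascendancy", "intrusiveness", "deification",
]
HOSTILITY = AREAS[:3]   # areas forming the "hostility" factor
CONTROL = AREAS[3:]     # areas forming the "control" factor


def score_pari(responses):
    """Score one respondent.

    responses maps each area name to its five item scores (1 to 4).
    """
    area_scores = {}
    for area in AREAS:
        items = responses[area]
        if len(items) != 5 or any(not 1 <= x <= 4 for x in items):
            raise ValueError(f"{area}: expected five item scores from 1 to 4")
        area_scores[area] = sum(items)              # ranges from 5 to 20
    hostility = sum(area_scores[a] for a in HOSTILITY)   # ranges from 15 to 60
    control = sum(area_scores[a] for a in CONTROL)       # ranges from 15 to 60
    return {
        "areas": area_scores,
        "hostility": hostility,
        "control": control,
        "total": hostility + control,
    }


# Hypothetical respondents at the two extremes of the scale.
lowest = score_pari({area: [1] * 5 for area in AREAS})
highest = score_pari({area: [4] * 5 for area in AREAS})
```

The two extreme respondents reproduce the limits stated in the text: 5 and 20 for each area, and 15 and 60 for each of the two major factors.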


Procedure 


The PARI was included in a battery of psycho- 
logical tests administered to the underachieving boys 
and the high-achieving boys in group-testing ses- 
sions. The instructions for this administration of 
PARI were as follows: 


Below are a group of questions about family 
life and child rearing. Please answer these ques- 
tions the way you think your mother (or the 
person who has substituted as a mother-figure in 
your life) would answer them. In other words, 
we want to see how you think your mother would 
respond to these questions if she were answering 
them. There are no right or wrong answers to 
these questions. They merely indicate attitudes 
and opinions on family life and children. If you 
feel that you don’t know how your mother would 
answer some of the questions, just give the best 
guess you can make without thinking too much 
about it. 


The students in both groups were informed that 
the psychological tests (including the PARI) were 
being given strictly for research purposes, to help 
gain greater understanding of personality and moti- 
vational characteristics related to academic attain- 
ment. Names were required on all instruments, but 
subjects were told that we would not reveal their 
tests to anyone. Completed PARIs were obtained 
from all boys enrolled in the two summer projects. 

At the end of the 6-week session, the identical 
PARI, but with different instructions, was mailed 
to the mothers of all of the boys who had partici- 
pated in the research. The instructions accompanying 
this form were: 


Below are a group of questions about your 
opinions and ideas about family life and child 
rearing. We are sending these questionnaires to 
the mothers of boys who attended the summer 
projects at Brown University in hopes of finding 
out how family attitudes are related to the aca- 
demic progress and interests of the boys. The 
returned questionnaires will be treated confiden- 
tially for research purposes. There are no right 
or wrong answers, so answer according to your 
own opinion. 


Completed questionnaires were received from 48 
mothers of the underachieving boys and 29 mothers 
of the high-achieving boys. 


Data Analyses 


The PARI data were analyzed statistically to com- 
pare (a) perceptions of maternal attitudes revealed 
by the two groups of boys, (b) attitudes avowed by 
the two groups of mothers, and (c) agreement be- 
tween mothers’ avowal and sons’ perceptions within 
the underachieving group and within the high- 
achieving group. 
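
The F and t columns of the tables can be related to the reported means and variances. The following Python sketch is illustrative only: the source does not state which t formula was used, so both the pooled-variance and the unequal-variance (Welch) forms are shown, and the function names are hypothetical.

```python
import math


def variance_ratio(var_a: float, var_b: float) -> float:
    """F ratio formed as larger variance over smaller variance."""
    return max(var_a, var_b) / min(var_a, var_b)


def pooled_t(mean_a: float, var_a: float, n_a: int,
             mean_b: float, var_b: float, n_b: int) -> float:
    """Two-sample t assuming equal population variances."""
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / se


def welch_t(mean_a: float, var_a: float, n_a: int,
            mean_b: float, var_b: float, n_b: int) -> float:
    """Two-sample t that does not assume equal variances."""
    se = math.sqrt(var_a / n_a + var_b / n_b)
    return (mean_a - mean_b) / se


# Control-factor summary values for the two groups of boys (Table 1).
f_control = variance_ratio(46.17, 92.37)
t_welch = welch_t(44.49, 46.17, 55, 36.52, 92.37, 31)
t_pooled = pooled_t(44.49, 46.17, 55, 36.52, 92.37, 31)
print(round(f_control, 2), round(t_welch, 2), round(t_pooled, 2))
```

With the control-factor values for the two groups of boys, the variance ratio and the unequal-variance t come out close to the entries in Table 1. This agreement is suggestive only and should not be read as confirmation of the procedure actually followed.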

Statistical analyses performed on data obtained 
from the total groups of boys were also computed 
for groups consisting of the 48 underachieving boys 
and 29 high-achieving boys whose mothers had also 
participated in the study. Findings from these 
slightly reduced samples were practically identical 
with those obtained from the groups of 55 and 31 
boys. That is, for all comparisons the same signifi- 
cant or nonsignificant results would be presented 
whether the statistics were computed on the basis 
of the slightly reduced or the entire group of boys. 
In view of these findings, it was decided to utilize 
the data from the total samples of boys for all 
group comparisons and to utilize only the reduced 
samples whenever the analysis required comparison 
of a specific boy with his own mother. In other 
words, where discrepancies between an individual 
mother and her son were involved, the statistics are 
based on groups of 48 pairs and 29 pairs, but when- 
ever groups were compared the entire samples of 
boys and mothers were utilized. 
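
The distinction between group comparisons and mother-son pair comparisons can be made concrete with a short Python sketch. All scores and family identifiers below are invented for illustration, and the son-minus-mother direction of the discrepancy is an assumption, chosen because it yields positive values when a boy attributes more of an attitude to his mother than she avows.

```python
from statistics import mean, variance
from typing import Dict, List


def paired_discrepancies(sons: Dict[str, float],
                         mothers: Dict[str, float]) -> List[float]:
    """Return son-minus-mother differences for families with both scores."""
    shared = sorted(set(sons) & set(mothers))
    return [sons[family] - mothers[family] for family in shared]


# Hypothetical control-factor scores keyed by a made-up family ID.
sons = {"f01": 44, "f02": 40, "f03": 50, "f04": 38}
mothers = {"f01": 30, "f02": 28, "f03": 31}  # f04's mother did not reply

# Group comparison: every boy is used, whether or not his mother replied.
group_values = list(sons.values())

# Pair comparison: only boys whose own mothers replied are used.
pair_values = paired_discrepancies(sons, mothers)

print(len(group_values), mean(group_values))
print(len(pair_values), mean(pair_values), variance(pair_values))
```

In this toy example the boy without a returned maternal questionnaire still contributes to the group mean but drops out of the discrepancy scores, which mirrors the use of 55 and 31 boys for group comparisons and 48 and 29 pairs for discrepancies.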


Hypothesis Testing versus Explorations 


One could derive hypotheses from various theo- 
retical viewpoints or formulate predictions on the 


of the relevant literature reveals considerable in- 
consistency, conflict, and ambiguity in regard to 
theoretical issues, methodology, and interpretation of 
findings. In view of this state of affairs, it seems 
more appropriate to regard this venture as explora- 
tory rather than primarily hypothesis testing, with 
the view that whatever empirical findings are un- 
covered by this approach to understanding of 
parent-child relations are likely to be of value to 
future investigators who endeavor to chart these 
waters which at this point are far from being 
adequately fathomed. 


RESULTS 


The findings presented in Table 1 reveal no 
significant differences between the under- 
achievers’ and high achievers’ perceptions of 
hostility in their mothers’ attitudes toward 
family life. However, the two groups of boys 
do differ markedly in their perceptions of 
maternal attitudes that form the control fac- 
tor, with the underachievers describing their 
mothers as higher on ascendancy, intrusive- 
ness, and deification. On every one of these 
dimensions, the differences between perceived 


TABLE 1 
COMPARISON OF PARI SCORES IN THE TWO GROUPS OF BOYS 

                           Underachievers     High achievers 
                              (N = 55)           (N = 31) 
Variable                   Mean   Variance    Mean   Variance     F       t 
Marital conflict          15.38     9.15     14.39    10.93     1.19    1.42 
Irritability              13.62    11.71     13.71    12.76     1.09     .12 
Rejection of homemaking   12.98     9.30     12.94    10.05     1.09     .06 
  Hostility factor        41.98    66.75     41.03    55.08     1.21     .54 
Ascendancy                13.53    11.05     11.00    12.06     1.09    3.35 
Intrusiveness             15.15     7.79     12.81    17.46     2.24**  2.79 
Deification               15.82     8.98     12.71    16.06     1.79*   3.77 
  Control factor          44.49    46.17     36.52    92.37     2.00*   4.08 
Total                     86.42   151.15     77.55   186.33     1.23    3.10 


TABLE 2 
COMPARISON OF PARI SCORES IN THE TWO GROUPS OF MOTHERS 

                          Underachievers’    High achievers’ 
                          mothers (N = 48)   mothers (N = 29) 
Variable                   Mean   Variance    Mean   Variance     F       t 
Marital conflict          14.17    16.40     14.52     8.48     1.93*    .44 
Irritability              12.71    17.30     13.59    13.65     1.27     .96 
Rejection of homemaking    9.58    12.21      9.21    11.27     1.08    1.17 
  Hostility factor        36.46   101.70     37.31    61.34     1.65     .40 
Ascendancy                10.08    13.73     11.66    16.37     1.19    1.70 
Intrusiveness              8.60    10.51      9.83    15.61     1.49    1.46 
Deification                9.90    16.18     10.66    17.78     1.10     .78 
  Control factor          28.58    63.53     32.14    96.89     1.53    1.65 
Total                     65.04   201.99     69.45   172.56     1.17    1.38 


* Significant beyond the .05 level. 


Comparisons of the actual attitudes avowed 
by the two groups of mothers, as shown in 
Table 2, reveal no statistically significant 
differences. Similarities between these atti- 
tudes are especially noteworthy on the dimen- 
sions comprising the hostility factor, with a 
mean difference of only 1 point on this overall 
factor. However, the findings for each of the 
three dimensions which constitute the control 
factor reveal consistent, but nonsignificant 
trends in the direction of somewhat more 
control being avowed by mothers of the high 
achievers. For the overall control factor, the 
4.5 difference between means, with higher 
scores in the mothers of the high achievers, 
approaches the .10 level of significance (two- 
tailed test), and is at least suggestive of a 
factor worthy of consideration in future 
research. 

Having made these comparisons between 
the groups of boys and between the groups 
of mothers, now let us compare perceptions 
of the boys with avowal by the mothers. As 
shown in Table 3, on every comparison the 
underachieving boys perceive more negative 
attitudes than their mothers actually avow. 
The only dimensions that do not show a sta- 
tistically significant difference are marital 
conflict and irritability. However, the rejec- 
tion of homemaking scale reveals a highly 
significant difference, with the boys perceiving 
much more of this attitude in their mothers 
than the mothers are willing to avow. For 
the overall hostility factor, the difference 
between the boys and mothers is significant 
beyond the .01 level. 

When one proceeds to the analysis of di- 
mensions constituting the control factor, the 
differences become even more pronounced, 
with the intrusiveness scale and the deifica- 
tion scale showing differences that attain very 
high levels of significance. The difference of 
16 points between the mean score for the 
boys’ perceptions of maternal control and 
the mothers’ mean score on this overall con- 
trol factor not only yields an extremely high 
t of 10.97, but also reveals a disparity of 
truly great magnitude. 

By comparison with the above results, the 
differences between perceptions of the high- 
achieving boys and attitudes avowed by their 
mothers are much less pronounced. As shown 
in Table 4, the two scales indicative of rejec- 
tion of homemaking and intrusiveness are the 
only ones yielding statistically significant t 
tests. There are definite trends (p = .10, two- 
tailed test) in the direction of greater hostil- 
ity and greater control being perceived by 
the sons than avowed by the mothers, and the 
total combined score is significantly greater 
in the boys’ perceptions than in the mothers’ 
avowal of these attitudes. However, compari- 
son of the findings in Table 3 with those in 
Table 4 makes it abundantly evident that the 
concordance between mothers and sons is 
greater in the families that have produced 
sons who are high academic achievers. 

In order to further explore the degree of 


TABLE 3 
COMPARISON OF PARI SCORES IN UNDERACHIEVING BOYS AND THEIR MOTHERS 

                           Underachieving         Mothers 
                           boys (N = 55)         (N = 48) 
Variable                   Mean   Variance    Mean   Variance     F       t 
Marital conflict          15.38     9.15     14.17    16.40     1.79*   1.70 
Irritability              13.62    11.71     12.71    17.30     1.48    1.22 
Rejection of homemaking   12.98     9.30      9.58    12.21     1.31    5.28 
  Hostility factor        41.98    66.75     36.46   101.17     1.52    3.07 
Ascendancy                13.53    11.05     10.08    13.73     1.24    4.98 
Intrusiveness             15.15     7.79      8.60    10.51     1.35   11.02 
Deification               15.82     8.98      9.90    16.18     1.80*   8.37 
  Control factor          44.49    46.17     28.58    63.53     1.38   10.97 
Total                     86.42   151.15     65.04   201.99     1.34    8.19 


* Significant beyond the .05 level. 
** Significant beyond the .01 level. 
*** Significant beyond the .001 level. 


concordance between responses obtained from 
mothers and sons, we paired each boy with 
his own mother and compared discrepancy 
scores found in the two mother-son groups. 
As shown in Table 5, for each of the dimen- 
sions comprising the hostility factor, and for 
this overall factor, there are no significant 
differences between the mother-son discrepan- 
cies in the two groups. For each of the dimen- 
sions comprising the control factor, however, 
there is significantly less discrepancy between 
mothers and sons in the high-achieving group. 
For the total PARI scores, the mean discrep- 
ancy between the underachieving boys and their 
mothers is much greater (13 points) than the 
comparable mean score in the high-achieving 
group. 

A somewhat different approach to assess- 
ing consonance within these families was to 
compute product-moment correlations be- 
tween each son’s and each mother’s scores for 
the various measures on the PARI. The cor- 
relation coefficients presented in Table 6 show 
no significant associations between mothers 
and sons in the underachieving group. For 
the high achievers, however, the correlations 


TABLE 4 
COMPARISON OF PARI SCORES IN HIGH-ACHIEVING BOYS AND THEIR MOTHERS 

                          High-achieving boys     Mothers 
                               (N = 31)          (N = 29) 
Variable                   Mean   Variance    Mean   Variance     F       t 
Marital conflict          14.39    10.93     14.52     8.48     1.29     .16 
Irritability              13.71    12.76     13.59    13.65     1.07     .13 
Rejection of homemaking   12.94    10.05      9.21    11.27     1.12    4.45 
  Hostility factor        41.03    55.08     37.31    61.34     1.11    1.90 
Ascendancy                11.00    12.06     11.66    16.37     1.36     .68 
Intrusiveness             12.81    17.46      9.83    15.61     1.12    2.85 
Deification               12.71    16.06     10.66    17.78     1.11    1.94 
  Control factor          36.52    92.37     32.14    96.89     1.05    1.75 
Total                     77.55   186.33     69.45   172.56     1.08    2.35* 


* Significant beyond the .05 level. 
** Significant beyond the .01 level. 
*** Significant beyond the .001 level. 


TABLE 5 
COMPARISON OF DISCREPANCIES BETWEEN MOTHERS’ AND SONS’ PARI SCORES 
IN THE TWO GROUPS 

                          Underachievers and  High achievers and 
                           mothers (N = 48)    mothers (N = 29) 
Variable                   Mean   Variance    Mean   Variance     F       t 
Marital conflict           1.15    26.94      -.04    14.31     1.88*   1.15 
Irritability                .92    22.07       .21    17.93     1.23     .67 
Rejection of homemaking    3.44    19.47      3.66    20.51     1.05     .21 
  Hostility factor         5.58   148.01      3.82   106.02     1.40     .65 
Ascendancy                 3.63    23.54      -.66    17.41     1.35    3.97 
Intrusiveness              6.73    20.72      2.76    27.35     1.32    3.51 
Deification                5.92    24.94      2.10    16.23     1.54    3.50*** 
  Control factor          16.27   120.80      4.21   115.14     1.05    4.72 
Total                     21.69   339.81      8.03   284.04     1.20    3.26** 


* Significant beyond the .05 level. 
** Significant beyond the .01 level. 
*** Significant beyond the .001 level. 


are not only consistently positive, but the 
coefficients indicative of association on the 
ascendancy dimension and the deification di- 
mension are statistically significant. Moreover, 
for the overall control factor the degree of 
association between the sons’ perceptions of 
their mothers’ responses and the responses 
actually obtained from their mothers is sig- 
nificant. Thus, these findings in Table 6 re- 
veal remarkably little association between the 
actual scores derived from the mothers and 
their underachieving sons, only nonsignificant 
association between mothers and their high- 


TABLE 6 
PRODUCT-MOMENT CORRELATIONS (r) BETWEEN 
MOTHERS’ AND SONS’ SCORES ON THE PARI 

                            Under-        High 
                           achievers    achievers 
Variable                   (N = 48)     (N = 29)       z 
Marital conflict              .02          .28        1.06 
Irritability              [illegible]      .28         .28 
Rejection of homemaking       .10          .06         .13 
  Hostility factor            .13      [illegible]     .41 
Ascendancy                    .05          .41*       1.57 
Intrusiveness             [illegible]      .17        1.22 
Deification                   .03      [illegible] [illegible] 
  Control factor             -.13          .40*       2.26* 
Total                         .08          .22         .59 


* Significant beyond the .05 level. 
** Significant beyond the .01 level. 

achieving sons on the hostility variables, but 
statistically significant association between 
their perceptions and avowals of attitudes 
indicative of maternal control. 
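
Table 6 gives a z statistic for the difference between the two groups' correlations without stating how it was obtained. One common approach, assumed here purely for illustration, is Fisher's r-to-z transformation; the function name in this Python sketch is hypothetical.

```python
import math


def fisher_z_difference(r_a: float, n_a: int, r_b: float, n_b: int) -> float:
    """z statistic for the difference between two independent correlations.

    Each r is transformed with the inverse hyperbolic tangent (Fisher's z),
    and the difference is divided by its standard error.
    """
    z_a = math.atanh(r_a)
    z_b = math.atanh(r_b)
    se = math.sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
    return (z_b - z_a) / se


# Correlations reported in Table 6: underachievers (N = 48) first,
# high achievers (N = 29) second.
z_ascendancy = fisher_z_difference(0.05, 48, 0.41, 29)
z_control = fisher_z_difference(-0.13, 48, 0.40, 29)
print(round(z_ascendancy, 2), round(z_control, 2))
```

Applied to the ascendancy and control-factor correlations, this method gives values near the tabled z entries. The small remaining differences are compatible with rounding of the published coefficients, but the agreement does not establish that this was the method used.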


DISCUSSION 


A general impression to be gained from this 
investigation is that the control factor appears 
to be more significant as a source of differ- 
ences between underachievers and high 
achievers and their mothers than is the hos- 
tility factor. In attempting to account for this 
finding, it is noteworthy that the dimensions 
comprising the hostility factor pertain to the 
general home situation and relations with the 
husband, while the control dimensions refer 
more specifically to mother-child interactions. 
It seems that many of these women are will- 
ing to admit marital conflict and irritability, 
but are very reluctant to avow attitudes in- 
dicative of rejection of homemaking. In both 
the underachieving and high-achieving groups 
there is a pronounced, highly significant dis- 
crepancy between the way sons view their 
mothers’ feelings about homemaking and the 
mothers’ willingness to avow dislike for this 
aspect of the maternal role. 

While these findings in regard to the hos- 
tility factor are interesting, the most signifi- 
cant finding from this research is the tre- 
mendous discrepancy between the under- 
achievers’ reported perceptions and their 
mothers’ avowal of parental control, with the 
boys describing them as controlling and the 
mothers avowing very little control. It should 
be realized that what the mother avows on 
the PARI, or any such instrument for assess- 
ing parental attitudes, may not be directly 
related to the way she behaves toward the 
child. Nevertheless, whether or not a mother 
were actually hostile or controlling, if a son 
reports her as being this way, this informa- 
tion should be of value in attempts to under- 
stand interactions between them. 

Two alternative possibilities suggested by 
these findings are that either the mothers are 
really low on the factor of control but their 
sons think they are high, or else the mothers 
are really high on this factor, just as their 
sons report them to be, but they are unwilling 
to acknowledge this fact. Of course, it may be 
that the underachieving boys attribute more 
control to their mothers because they want to 
blame them for their own difficulties. This 
way, they assign the control (and probably 
the responsibility) to their mothers instead of 
to themselves. 

Several other interpretations are also plaus- 
ible. For example, these findings might indi- 
cate that the failure to do well in school is a 
sign of emotional disturbance which is also 
evidenced by deviant perceptions of the 
mother, or that years of being urged by all 
manner of people to do well in school may 
have led to a generalized perception of all 
adults (not merely the mother) as excessively 
controlling. Thus, the present research design 
permits a variety of interpretations. Regard- 
less of which of these interpretations and 
speculations are correct, however, the results 
show inconsistency, ambiguity, and dissonance 
in this important domain of mother-child in- 
teraction. 

From our clinical experience working with 
these teenagers, we know that issues concern- 
ing control and discipline are of vital im- 
portance to them. Not only in casual conver- 
sations, but also in discussions recorded in 
group psychotherapy sessions, we have heard 
many of these boys describe their unresolved 
conflicts over yielding to adult authority or 
rebelling against it. In this regard, it is note- 
worthy that most of them immensely dislike 
the orderly, rigidly enforced daily routine 
which they encountered at the beginning of 
the summer program. However, by the end of 
their 6-week stay many had changed their 
views and spoke very favorably about the 
benefits they had derived from this external 
control. 

The ambivalence and confusion revealed by 
these underachievers in their perceptions of, 
and reactions to, the attempts of adults to 
control them may well be related to the diffi- 
culties they experience in exerting self-con- 
trol. Our studies of these underachievers have 
demonstrated that they tend to be highly 
impulsive and present-oriented, seeking im- 
mediate gratification, with little ability to 
work toward future goals (Davids & Sidman, 
1962), and with poor powers of concentration 
in the face of distractions (Silverman, Davids, 
& Andrews, 1963). Thus, both our observa- 
tions in group therapy and the results of 
these experiments suggest that these bright 
underachieving teenagers are deficient in the 
ego-controls that make for self-discipline and 
are in need of structured, consistent, firm 
control from without. However, while most of 
them reacted favorably to this sort of con- 
trolled environment in our summer program, 
it seems that the problem of control within 
their family settings is far from adequately 
resolved. 

It is obvious that the present research ap- 
proach has raised more perplexing questions 
about internal and external control in teenage 
underachievers than it is prepared to answer. 
Hopefully, extensions of this research, in- 
cluding fathers as well as mothers, with ob- 
servations of actual behavioral interactions 
between parent and child as well as assess- 
ment of avowed attitudes, should lead to 
essential information needed to fill current 
gaps in knowledge about relations between 
parental characteristics and child attainments. 


MARITAL PROBLEMS FOLLOWING PSYCHOTHERAPY 
WITH ONE SPOUSE 


NATHAN HURVITZ 
Los Angeles, California 


Psychotherapy with 1 spouse may complicate marital problems or create 
problems where none existed. Spouses referred for marriage counseling after 
1 has been in therapy therefore present special counseling problems. The 
preferred approach in marriage counseling is based upon an examination of 
the spouses’ “exchanges,” their role interaction, and their relationship with 
each as the “significant other.” When a marriage counselor works with the 
spouses, 1 of whom has been in therapy, he examines their interaction about 
the therapy as he would any other problem area in their lives. 


Although a married adult may initiate psy- 
chotherapy with the expectation that the 
therapy, by helping him to overcome his indi- 
vidual and interpersonal problems, will also 
enhance his marriage, the therapy may ac- 
tually harm the marital relationship. The 
spouse may go to a therapist of the same sex 
—a husband to a male and a wife to a female 
—or to a therapist of the opposite sex, and 
the therapy may be conducted in various 
clinical settings. The marital problems caused 
by each of these client-therapist relationships, 
and the settings in which they occur should 
be examined to determine whether these prob- 
lems are unique to particular relationships or 
settings or are generic to psychotherapy with 
one spouse. The present paper is concerned 
with the situation in which the client is a 
woman and the therapist is a man, the most 
common client-therapist relationship in pri- 
vate practice. The observations and conclu- 
sions presented here are drawn from the 
experience of one clinician; however, discus- 
sion with colleagues indicates that the pres- 
ent writer’s experiences are shared by other 
practitioners (Hurvitz, 1965a). 

The wife initiates therapy not because of 
specific marital problems, but because of 
feelings of inadequacy, depression, frigidity, 
psychosomatic complaints, or irritation and 
friction with her husband and/or her chil- 
dren, etc. Although her problems have devel- 
oped since her marriage, are exacerbated by 
her marriage, and seriously disturb her mar- 
riage, the therapist’s primary consideration is 
not the significance of these problems and 
complaints in her marriage. On the contrary, 
the therapist, who may be a psychiatrist, 
psychologist, caseworker, etc., and who has 
been trained and practices within a psycho- 
dynamic or psychoanalytically oriented frame- 
work, is concerned with her as an individual 
and becomes involved with her in an intensive 
therapeutic relationship about her stated com- 
plaints which he regards as symptoms (Eisen- 
stein, 1956; Family Service Association, 
1947; Greene, 1965; Mudd, 1951; Pincus, 
1960). 

At the outset of their relationship the 
therapist may inform the wife that his obli- 
gation is to her mental and emotional health 
and not to her marriage (Kubie, 1956). He 
advises, and may even warn her, that her 
marital problems may be intensified as a re- 
sult of psychotherapy, but this is the price 
she may have to pay for the alleviation of 
her complaints. He explains to her that as 
her therapy helps her to become healthier, 
she may no longer be willing to remain in a 
sick marriage. Thus, if the marriage is de- 
stroyed as a consequence of her therapy, it 
is because the marriage met her neurotic 
needs, but cannot meet her healthy needs. Al- 
though she hears the therapist, and although 
she may have heard or read that marriages 
may be broken as a result of psychotherapy, 
she is so eager to get relief from her com- 
plaints that she may regard the possibility of 
the dissolution of her marriage as a necessary 
sacrifice for the changes she wants to achieve 
for herself. And even while she considers this 
possibility, she does not believe that it will 
happen to her. 

The therapist, utilizing the concepts and 
techniques of his particular theoretical orien- 
tation and his personal creative skills, may 
help the wife to overcome some of her com- 
plaints, and she may gain a great deal from 
the therapy. However, the therapy itself 
eventually becomes a cause of conflict be- 
tween the spouses. More important, the ther- 
apy disturbs the relationship between the 
spouses, old difficulties and problems are com- 
plicated and intensified, new problems are 
created, and their marriage may be destroyed 
(Anonymous, 1958; Lehrman, 1962; Moran, 
1954; Pinckney & Pinckney, 1965; Pollak, 
1965). Both spouses may then be referred to 
a marriage counselor for help with their con- 
tinuing difficulties which have become more 
acute, and with their new problems precipi- 
tated by the therapy. 


PROBLEMS CREATED BY PSYCHOTHERAPY 


The wife’s involvement in therapy is an en- 
tirely new experience for both spouses. De- 
spite the fact that it is socially sanctioned, 
she nevertheless does not share information 
about her therapy with all her family and 
friends, nor does she share its more meaning- 
ful or intimate elements with her husband. 
There are many and varied practical prob- 
lems to work out about her therapy: schedul- 
ing, transportation, care of the children, ex- 
planations to others, finances, etc., which re- 
quire the cooperation of both spouses. If the 
husband is not completely cooperative, he 
may introduce new irritations into their rela- 
tionship. But in addition to these minor irri- 
tations, therapy with one spouse may intro- 
duce major problems into the marriage. Not 
all of the following problems created by ther- 
apy are found in each instance in which a 
wife has been in therapy; however, one or 
more are found to some degree. Nor are these 
problems due to the therapist’s ineffective- 
ness; they are inherent in the psychothera- 
peutic experience with one spouse. 

The therapeutic transference may compli- 
cate the basic relationship between the 
spouses. Through the wife’s identification 
with her therapist, his upper-middle-class 
ways and values become her own. Her ways 
and values and her style of life change in 
many different ways—some major and some 
minor. These changes are apparent to her 
friends, relatives, neighbors—and to her hus- 
band. Since the changes are based upon the 
ways and values of another man with whom 
he has no contact, and since they require 
changes in him, he may resent and resist his 
wife’s changes which are based upon these 
new ways and values. Through the wife’s 
identification with her therapist she may re- 
gard his interest in her as a client to be 
interest in her as a “person.” Since one of 
her complaints against her husband is that 
he does not regard her as a “person,” her 
interpretation of the therapist’s interest be- 
comes the model of her expectations of her 
husband. Her husband cannot fulfill her ex- 
pectations of him, and he becomes increas- 
ingly aware that she is not fulfilling his ex- 
pectations of her. The wife’s changes in her 
ways and values and the spouses’ inability to 
fulfill their expectations of each other further 
estrange them from each other. 

The transference may create or complicate 
specifically sexual problems between the 
spouses. When the therapist, utilizing the 
transference, explores the wife’s sexual feel- 
ings toward him, such feelings may be 
aroused if they were not already there and 
intensified if they are. Erotic fantasies, which 
cannot be fulfilled, follow about the therapist 
and affect the wife’s sexual relationship with 
her husband. The husband reports later that 
his wife may have been depressed before she 
saw her therapist, but was elated after she 
had seen him, and he may have a record to 
show that he and his wife were most likely to 
have intercourse after she had seen the thera- 
pist. The husband acknowledges that as a 
result of her therapy his wife is more active 
in intercourse. She is, however, less likely to 
respond to his sexual advances. When he at- 
tempts to stimulate her sexually, he feels 
that she is observing and evaluating him, and 
he becomes uncomfortable and ineffective; 
and when he attempts to overcome her re- 
sistance to his sexual advances, she complains 
that he does not regard her as a “person.” 
The husband learns that patients “fall in 
love” with their therapists, and he is con- 
cerned that this may have happened to his 
wife. He taunts her about her feelings for the 
therapist while he fears to express his anxiety 
about some intimacy between them. 

The therapy may encourage extramarital 
sexual experiences. The accepting, nonjudg- 
mental attitude of the therapist may help to 
free the wife from the shame and guilt feel- 
ings she may have about typical childhood 
and other premarital or extramarital sexual 
experiences. Although this attitude may cre- 
ate a more permissive atmosphere for greater 
sexual freedom with her husband, it may also 
create a more permissive attitude toward ex- 
tramarital sexual experiences. The heightened 
sexual feelings which are aroused in the ther- 
apy, which the wife cannot express with her 
therapist, and which she does not want to 
express with her husband may be demon- 
strated extramaritally. Such experiences give 
the wife feelings of greater sexual compe- 
tence. They also satisfy the sexual feelings 
aroused but not fulfilled by the therapist, let 
the therapist know that with his help she 
has learned how to be more sexually effective 
with a man and, in discussing her experiences 
with him, vicariously enjoy the sexual experi- 
ence with the therapist. This activity and in- 
volvement may make the wife less interested 
or effective as a sexual partner with her hus- 
band and may arouse other shame and guilt 
feelings. 

The problems that the spouses have in their 
marriage come to be regarded as less amena- 
ble to their own efforts to work them out. The 
therapist interprets the spouses’ problems to 
be due to unconscious determinants of be- 
havior, intrapsychic conflicts, and infantile 
fixations, and not to changeable elements in 
their current interaction. The wife’s therapy 
introduces her to a new way of understand- 
ing herself and others. She is aware that she 
has changed since she initiated therapy and 
believes that this is due to the therapist’s 
method of helping her to gain self-under- 
standing through the exploration of her un- 
conscious desires, etc. Since the therapist’s 
method has helped her as an individual, she 
believes that a more intensive application of 
his method may help her with her husband. 
She believes that the problems she has with 
her husband are due to inadequate and in- 
sufficient self-understanding and that these 
problems will be overcome when she gains 
proper and sufficient self-understanding with 
the guidance of the therapist. Not only does 
her therapy teach the wife a way to under- 
stand herself, but she uses her experience to 
demonstrate her ability to understand her 
husband, and she describes his behavior ac- 
cording to the concepts she learns from her 
therapist. When her husband denies her com- 
petence and rejects her interpretation of his 
behavior, she tells him that this is under- 
standable resistance and proof of her correct- 
ness. While the husband resents and rejects 
his wife’s psychological interpretation of his 
behavior and motivations, he also learns the 
jargon she uses. He tells her that the thera- 
pist is a “father figure” to her, and he finds 
reasons in her biography to justify his in- 
terpretation. In doing so, he joins her in shift- 
ing their differences from problems which 
arise in their interaction to a more sophisti- 
cated but fruitless name-calling. 

The wife may regard the failure of the 
therapy to be her husband’s “fault.” When 
therapy does not achieve all that the wife 
wants from it, she complains that this is due 
to something her husband is or is not doing, 
and she becomes more hostile toward him and 
more estranged from him. The wife is aware 
that she has made significant gains due to 
her therapy; however, there may be areas in 
which she has not made as much progress as 
she would like, as in the sexual sphere. Al- 
though this limitation may be due primarily 
to her own problems and to the complications 
introduced by the therapy itself, she may 
hold her husband responsible for her inade- 
quacy and reject him on this basis. Her re- 
jection makes it more difficult for him to 
function effectively and thereby help her, and 
their problems are further complicated. 

The wife’s therapy offers her a permissive 
setting within which she disparages her hus- 
band with impunity and thereby reinforces 
her negative attitudes toward him. In her 
sessions with the therapist she tends to de- 
fend herself and to blame her husband with 
his shortcomings for her problems. Since the 
therapist has no other source of information 
about her husband’s behavior and attitudes, 
he tends to accept her picture of her husband 
and to comment critically about him. The 
wife then believes even more firmly than she 
did that her husband is responsible for her 
problems and that he must change to conform 
with her expectations. The husband knows 
that his wife shares their most intimate se- 
crets with the therapist in order to present 
him as incapable and inadequate, and he is 
frustrated and angry. He resents the therapist 
for having intimate, unflattering knowledge 
about him. He believes that the therapist and 
his wife are allied against him and that he 
cannot defend himself. Therefore if the thera- 
pist does want to see him, he is suspicious of 
the therapist’s purpose and he rejects the sug- 
gestion—or he is so disturbed and defensive 
when he sees the therapist that he proves his 
wife’s charges against him. 

The husband may be made to feel that he 
is a superfluous person in his wife’s therapy 
and that she does not need him to overcome 
her problems. Although he is the person who 
is most intimately involved with the wife, 
there is no place for him in her therapeutic 
experience, and he is not considered in the 
therapeutic course. However, the wife’s prob- 
lems not only have a history, they also have 
precipitants in the present and are either irri- 
tated or lessened by her expectations of her 
future—and her husband is the most impor- 
tant person in her present and future. Since 
the therapist may not give sufficient consid- 
eration to either the husband’s responsibility 
for causing or complicating his wife’s prob- 
lems, or for helping her to overcome them, 
the husband’s anxieties and resistances are 
not examined and they may subtly sabotage 
the therapeutic effort. 

The wife’s gains in therapy tend to make 
her husband feel inadequate. He recognizes 
his wife’s great regard for the therapist and 
may believe that she regards the therapist 
more highly than she regards him. He be- 
lieves that his wife values the therapist’s 
authority and judgment more than his and 
that the therapist has usurped his role as the 
family decision-maker. He believes, on the 
basis of the evidence he has accumulated, 
that his wife sometimes refuses to discuss an 
important issue with him until after she has 
discussed it with the therapist. He believes 
that his wife disparages him to the therapist 
and that the therapist agrees with his wife’s 
disparagement. These beliefs, which may or 
may not be true, arouse anxiety about his own 
feelings of adequacy. 


The wife’s therapy helps her to gain suffi- 
cient strength to function differently within 
her marriage and to impose a new interaction 
pattern upon their relationship. Through her 
therapy she gains greater self-regard, feelings 
of competence, and ability. As she improves 
in her ability to assume control and give 
direction, and her husband is not prepared 
for her changes, he may be threatened or 
overwhelmed, and their relationship is forced 
into a different and more precarious balance. 
He learns that her gains are at his expense 
and make him less able to function as a 
marital partner. He also learns that he must 
capitulate to the new relationship she defines 
for him in order to maintain the marriage. He 
may therefore become the spouse with the 
problems or complaints (Dollard & Miller, 
1950, p. 133; Giovacchini, 1965). 

The husband may resist his wife’s efforts 
to impose a new interaction pattern upon 
their relationship. He finds that because of 
the changes that have taken place in his wife 
as a result of her therapy, his characteristic 
behavior does not elicit the same response 
from her, and he is disturbed by these 
changes. However unpleasant her problems 
and complaints and their earlier relationship 
may have been, the husband prefers these 
over the new relationship she is attempting to 
establish in which his wife’s changes and 
gains appear to him to be at his expense. He 
becomes aware that he is the spouse with the 
problems or complaints. In order to reestab- 
lish or maintain their preexisting relationship 
which was not satisfying, but also was not as 
threatening as the new ways in which he is 
compelled to function, the husband is forced 
into ever more extreme behavior to constrain 
his wife in their earlier relationship. Every 
gain that she makes in therapy, which her 
husband regards as a threat to him, is met 
with such resistance and manipulation by him 
that the gain is negated. The old problems 
and complaints remain while new ones are 
created. 


TERMINATION OF THERAPY 


At some point in his wife’s therapy, the 
husband’s enthusiasm about the positive 
changes which have taken place in his wife
begins to wane, and he grows concerned about
the ultimate meaning of these changes. He is
aware that her complaints have lessened— 
that she appears to function more effectively 
and to feel better about herself. However, he 
is also concerned: His wife is somehow in-
volved with her therapist in a way that dis-
turbs their life together in general and com-
plicates their sexual relationship in particu-
lar. The therapist’s involvement with his
wife’s unconscious dynamics and childhood
experiences appears to the husband to be re-
mote from his concern about their present in-
teraction. He believes that her therapy offers
her an opportunity to complain about him and 
blame him for their problems. He feels that 
he is a hindrance to his wife’s therapy and 
responsible for her lack of progress or that 
he is superfluous—that she does not need him 
to overcome her problems. He believes that 
his wife may be gaining sufficient strength to 
force him into a new relationship which she 
is changing to suit herself. He also feels that 
her gains make him correspondingly inade- 
quate, create problems for him, and that he 
must fight back in order to survive. 

The husband’s resistances become specifi- 
cally focused on the interminability of the 
therapy. He chooses this issue because it does 
not involve him and he does not have to ex- 
pose his own concerns and anxieties, and be- 
cause he believes that others will agree with 
him that his wife is vulnerable on this score. 
He may say that although his wife has obvi- 
ously been helped by her therapy, a long 
time has passed since there have been any 
additional noticeable changes, and it is there- 
fore time to stop—particularly since it is so 
expensive. The wife does not agree. She re- 
plies that as a result of her therapy she knows 
herself better; and as she continues in therapy 
and learns more about herself she will func- 
tion even better than she does now. Her hus- 
band acknowledges that she knows herself 
better, but he points out that she is not be- 
having differently. He insists that it is the 
way she behaves toward him and the children 
that makes the therapy worthwhile for him. 
The wife, in return, charges him with being 
insensitive to her problems and trying to de- 
stroy the one significant relationship of her 
adult life. 

All of the differences between the spouses
are epitomized by their conflict about con-
tinuing the wife’s therapy, and their inability
to solve this problem aggravates their other
problems. The husband’s complaints that his
wife has not changed her behavior for a long 
time do not bring about the termination of 
the therapy. He therefore threatens to inform 
the therapist how he feels. The wife does not 
want her husband to intrude upon the inti- 
mate association she has developed with the 
therapist, and she fears that the precious, 
succoring experience she has with him will be 
destroyed. When she informs the therapist of
her husband’s concerns and complaints about 
the therapy and of his efforts to destroy their 
relationship, she does so because of her vari- 
ous concerns about her ability to function 
without the therapist’s support, to test the 
therapist’s loyalty to her, and to learn whether 
he will ally himself with her and her needs or 
respond to her husband’s pressure. On the 
basis of his evaluation of the wife’s needs 
and his response to the husband’s concerns 
about his wife’s therapy, the therapist may 
recommend that the husband enter therapy 
also. This recommendation may be accepted 
by both spouses. Each of them goes to a
different therapist for an indeterminate length 
of time in an effort to resolve their individual 
and joint problems in this way. 

In the type of situation considered in this 
paper, the husband rejects the therapist’s 
recommendation. He explains that he is not 
depressed, he has no psychosomatic com- 
plaints, he is effective sexually, he performs 
well on his job, and he gets along well with 
people. The only problems he has are with 
his wife, and he wants to work them out with 
her. He may communicate this information 
to the therapist through his wife or inform 
the therapist directly. The therapist under-
stands from the husband’s complaints that he
is so unhappy about his wife’s therapy that
he may attempt to obstruct it in various 
ways and to negate its value if it continues. 
The therapist may therefore explain to the 
wife that she has gained a great deal of un- 
derstanding and insight into the underlying 
dynamics and unconscious motivations of her
behavior and that she and her husband are
now ready to learn new ways to handle their
practical, everyday problems. Whereupon he
suggests that marriage counseling may be the
preferred way to handle the present crisis 
with her husband, and he refers her and her 
husband to a marriage counselor. 

When the husband succeeds in severing 
the therapy relationship between his wife and 
the therapist, profound changes take place in 
the behavior and feelings of the principals be- 
cause the therapy and its termination have 
different and significant meaning for each of 
them. Whenever therapy that has been con- 
ducted along classical lines is terminated, 
there may be greater or lesser unresolved 
transference feelings. Certainly there is an 
unresolved transference if the husband has 
precipitated the conclusion of the therapy for 
whatever reasons he may have to justify his 
actions. The wife will obviously have strong 
hostile feelings against her husband. But she 
will also have negative feelings toward the 
therapist and against any subsequent thera- 
pists or counselors who may attempt to coun- 
sel with one spouse alone or with both spouses 
together. The husband will be concerned
about his own motives for forcing the termi-
nation of his wife’s therapy, whether his
achievement was in the best interests of both,
and what it means for their subsequent
relationship. The termination arouses re-
sistance and hostility toward each other, and
their negative feelings may preclude any 
further help for the spouses as a married cou- 
ple. Their difficulties may become more co- 
vert: they are shifted to less apparent and 
discussable areas, and the spouses resist 
further aid which may help them. Thus the 
spouses may be less likely to accept a referral 
for marriage counseling, or to undertake and 
continue with marriage counseling, than an- 
other couple in conflict, neither of whom has 
been in therapy. 

Greater marital discord after therapy with 
one spouse appears so often that it may be
inherent in such a relationship. The experi- 
ence of the writer suggests that if the mar- 
riage were not disturbed to some extent, then 
the therapy would not be as helpful to the 
spouse seeking individual help as it could have 
been. There appears to be an element or fac-
tor in intensive individual psychotherapy con-
ducted along classical lines that generates be-
havior, attitudes, and values associated with
individualism which makes the married client
less able to live comfortably with his spouse.

The therapist may correctly believe that 
he has helped his client to overcome some of 
her problems and complaints. However, by 
helping her in this way he has not helped her 
to handle the central relationship of her adult 
life, her marriage—and out of this unsatis- 
factory relationship continuing problems will 
arise. The criterion for the damage done to a 
marriage after one spouse has been in therapy 
is not that the marriage is wrecked (Kubie, 
1956). If this were so, the dangers would be 
much more dramatic and obvious, and they 
would therefore be considered more carefully. 
The danger is that the relationship between 
the spouses, which should be used to benefit 
them both, is further disturbed. 


THE INTERACTION APPROACH 


Counselors and therapists therefore need a 
theoretical framework to guide them in their 
work with both spouses and ways to involve 
both spouses when one of them has psycho- 
logical problems. The growing awareness of 
the need to involve both spouses in psycho- 
therapy and/or marriage counseling has re- 
sulted in the utilization of the traditional ap- 
proaches of individual psychotherapy with 
both spouses. This development is indicated 
by collaborative, combined, concurrent, con- 
joint, multiple-impact, stereoscopic, and tri- 
adic therapies, which attempt to solve marital 
problems by uncovering unconscious sources 
of inappropriate behavior (Greene, 1965). 
What we see here is the development of new 
techniques required by circumstances and 
rationalized by a theory which is not really 
suited to the situations in which it is applied. 
It is rather more appropriate to utilize a 
concept that encompasses both spouses simul- 
taneously from the outset. Such a concept 
regards the spouses as interacting members of 
a social system they have formed and within 
which they perform the roles associated with 
this system, it regards their individual or 
joint problems as the result of the stresses 
and strains in their role transactions, and it 
uses the interaction between the spouses 
in the effort to help them with their indi- 
vidual and marital problems. 
The counselor examines the relationship
between the spouses in terms of their role 
interaction. He identifies and counsels with 
the marital problems of the spouses about 
their inappropriate role relationships, their 
meaning to the spouses, and the feelings 
that have been aroused in their interac- 
tion. The counselor understands that their 
roles and associated norms are learned 
so well in the process of formal and informal 
education through the various communication 
media of our society that they are internal- 
ized and determine behavior in specific situ- 
ations. Not only do the roles embody norms 
of verbal and nonverbal behavior, but 
whether or not the roles are performed ac- 
cording to the other’s expectations implies 
that certain feelings will be aroused. When 
marital roles are performed according to the 
other spouse’s expectations, the associated 
feelings are positive, leading to marital ad- 
justment and personal ease; when marital 
roles are not performed according to the other 
spouse’s expectations, the associated feelings 
are negative, leading to interpersonal strain 
and inner conflict. Since men and women in 
our society may be taught different norms of 
role performances and expectations, or since 
specific life situations and experiences of the 
spouses may cause each of them to have idio-
syncratic or incompatible norms of marital
roles within the family, their role relationship 
may be strained, and both spouses may ex- 
press negative feelings about themselves, 
about the other, and about their marriage. 
The counselor, understanding the sources of 
their negative feelings, examines the reciproc- 
ity of their role performances and role expec- 
tations and the associated meaning and affect 
when these roles are or are not complementary 
(Hurvitz, 1965b). 

In the counseling setting the counselor 
helps the spouses become aware of their 
“exchanges”—concise representative examples 
of the content and character of their inter- 
action expressed verbally or nonverbally—as 
they perform their roles within the social 
system they have created. These exchanges 
may serve either to aggravate or conciliate 
their problems, and the counselor attempts 
to encourage the latter. The counselor at-
tempts to help each spouse to account for his
own and the other’s behavior in the exchange
by encouraging each of them to offer inter-
action hypotheses for his own and the other’s
behavior. He does this to learn how well each
spouse can “take the role of the other” and
to teach them how they can do this better,
thus facilitating communication and under-
standing between them (Mead, 1934). Since
the counselor is guided by his awareness that
behavior is oriented toward significant others,
he helps each spouse, as the member of an
interacting social system, to change his own
and the other’s inappropriate behavior and to
understand the implied meanings and the as-
sociated feelings. These may have been
learned in interaction with different signifi-
cant others and reinforced because the behav-
ior and/or the feelings have served or now
serve some purpose for one of the spouses
or for someone else in their environment. As
part of the counseling process, the spouses
may be helped to understand how they
learned to behave as they do, as a basis for
helping them to learn to behave in ways that
are more constructive to them individually
and more satisfying to their relationship. But
they are also helped to understand that how-
ever they may have learned their currently
inappropriate behavior, it is now necessary to
learn the kind of behavior which enhances
their relationship. In his relationship with the
spouses the counselor makes an aware effort
to plan better (mutually more satisfying)
behavior and to encourage its repetition. He
also encourages each spouse to respond to the
other’s efforts and to stimulate and encourage
better behavior in the other. As each spouse
responds to the supportive behavior and posi-
tive affect, he becomes a reciprocal stimulus
for further encouragement and reinforcement
of their positive interactions (Hurvitz, 1967).

Thus, when a married person goes to a
psychotherapist or counselor of any psycho-
logical persuasion for whatever problems or
complaints, the other spouse, as the “signifi-
cant other,” should be involved in the therapy
from the outset. It is not that individual psy-
chotherapy should never be undertaken with
one spouse of a couple; it is that such therapy
must always consider that the other spouse
is intimately involved in the genesis, expres-
sion, and modification of the problem or com-
plaint which brought his spouse into therapy.
The initial appointment should be for both 
spouses, and joint sessions for both spouses 
should be planned. If the wife presents the 
problems or complaints, the therapist’s plan 
must include her husband even if the thera- 
pist does not have any sessions with him. 
The therapist must work with the wife as 
though her husband was participating in ther- 
apy even though the latter may not do so 
because he is unwilling, because he cannot for 
reasons beyond his control, or because the 
spouse who is entering therapy objects to his 
participation.


APPLYING INTERACTION CONCEPTS 


When a couple, one of whom has been in 
therapy, come to a therapist or counselor, 
he must be aware that the role relationship 
between the spouses has been disturbed by 
the therapy, creating additional and unique
strains between them. The counselor therefore 
has a double task: He must consider how to 
help the spouses with their problems, and he 
must be aware of their conflict about the 
therapy—which played a part in bringing the 
spouses to him. The remainder of this paper 
indicates how the counselor works with the 
marital problems caused by, or accentuated 
by, the therapy of one spouse—which is an 
aspect of how he works in a continuing 
counseling relationship. 

From the outset the counselor should ex- 
plore the feelings of the spouses toward him 
as an individual who is filling the role of 
counselor. He may tell the spouses that he
understands that they have come to him with 
feelings toward him which he did not stimu- 
late—that their antagonistic or negative feel- 
ings are not due to anything he has done, 
but may nevertheless make it difficult for him 
to help them. He may tell the wife that he 
understands that her relationship with her 
therapist was important to her, that it may 
have been difficult for her to accept his refer- 
ral to someone else, and that it may be un- 
comfortable for her to work with someone 
who is also counseling with her husband. He 
may tell the husband that he is aware that 
his attitude toward his wife’s therapist may 
color his attitudes toward any counselor. The
counselor may also suggest that the husband
may believe that his wife can manipulate him
because of her familiarity with the experience
and vocabulary of therapy and that the coun-
selor will develop the same kind of relationship
with the wife as she had with her therapist. Al-
though both spouses may disavow these atti-
tudes that the counselor ascribes to them,
they understand that the counselor is aware
that there are important attitudes they have
not expressed which will affect his ability to
help them; and their awareness of his
perspicacity may indicate to them that he is
sensitive to their feelings and may therefore
be able to help them both.

Both spouses are encouraged to evaluate 
what the wife’s therapy meant to each of 
them and to the other. The counselor asks for 
each spouse’s evaluation or interpretation of 
the need for therapy, its accomplishments and 
failures, its effect upon the various aspects 
of their relationship and their marriage as 
a whole, and its continuing meaning for them. 
He also questions each spouse’s interpreta- 
tion of what the therapy meant to the other 
spouse; and he examines the discrepancies 
between each spouse’s interpretation and the 
interpretation ascribed to him by the other. 
The counselor may want to discuss specific 
periods or events related to the therapy such 
as the time it was first suggested, when the 
wife considered it, the time the husband sug- 
gested termination, and the time of final 
termination. These instances are explored for 
the understanding they offer about the char- 
acteristic interaction of the spouses, what 
may motivate each of them, and what each 
of them hypothesizes about the motivation 
of the other. 

The counselor may ask the wife how her 
husband shared responsibility for her prob- 
lems, how she sought help from him and 
how he responded, and what his attitude was 
when it was suggested that she go for profes- 
sional help. The counselor may also ask the 
wife why the relationship with the therapist
became so important for her, to examine more
carefully what she wanted from this rela-
tionship and what she projected into it. The
counselor points out that she had to pay a
professional to give her the kind of attention
and to regard her in the way she wanted to
be regarded by her husband and urges her to
consider the meaning of such a situation. The
counselor questions whether she wants some-
thing from her husband that he cannot give
her, something no man could give her if she
aroused in him the same feelings that she
arouses in her husband. The counselor points
out to the wife that she says that the thera- 
pist made her feel like a “person,” a feeling 
she wants her husband to give her, and asks 
how she can help him to understand, in a 
nonthreatening way, what she means by this, 
and to help him to give her this feeling. The 
counselor helps the wife to consider what her 
therapy meant to her husband from his point 
of view and what his feelings are about the 
therapy. If she says she does not know and 
does not care about his feelings, she should 
be helped to consider the consequences of 
such an attitude and whether she is prepared 
to accept them.

The counselor may ask the husband what 
he believes were the causes of his wife’s prob- 
lems and complaints, how he tried to help her 
so she would not need to go for professional 
help, what her feelings were about his efforts 
to help her, and what his feelings were when 
she finally decided that she must have profes- 
sional help from a psychotherapist. The coun- 
selor may ask the husband why his wife’s 
relationship with the therapist became so 
threatening to him, whether his insecurity 
about the therapy is characteristic of him 
in other situations and relationships, and 
whether it was the pressures created by his 
wife’s therapy that exposed this to him. The 
counselor helps the husband to examine 
whether he feels that his wife’s experience 
with the therapist was due to his own inade- 
quacy because he had to pay another man to 
give his wife something that he was not able 
to give her. The counselor points out that his 
wife says that she wants to feel like a 
“person,” and asks him what he can do in 
their continuing relationship to help her to 
let him know what she wants from him to 
help her feel this way. The counselor encour- 
ages the husband to examine what he has not 
given his wife that she needed to get
from another man. The husband must also
be helped to recognize that the therapy was 
a unique and helpful experience for his wife 
and that it therefore has special meaning for
her. The counselor suggests that if the hus-
band cannot accept the fact that the therapy
will continue to have meaning for his wife,
he must then understand that he will create
problems in his relationship with her; and
he must consider the consequences of such
an attitude and whether he is prepared to
accept them.

As the counseling continues, the counselor 
listens to the spouses’ reports of their ex- 
changes at home and to their exchanges in 
the office for problems that were accentuated 
by the therapy or which were precipitated 
by the termination of the therapy. As he 
examines and evaluates their exchanges with 
them, he must consider in what way their 
exchanges about the therapy are character- 
istic of their communication and interaction, 
and he relates these to the total pattern of 
their role relationships. Thus the spouses’ ex- 
changes about the therapy are examined like 
any other problem area in their lives, as an 
aspect of counseling based upon an interaction 
approach. 

This study stems from a proposition common to several psychological theories:
There is movement toward similarity or equilibrium in social interaction in
a dyad. The interaction studied concerns values in the therapist-patient dyad.
The sample of 38 therapists and 44 patients was obtained at 2 psychoanalytic
training centers. The values were measured by the Ways to Live scale and
the Strong Vocational Interest Blank. The results indicated that (a) therapists
and their own patients were closer in values than those randomly paired,
(b) therapists did not share a homogeneous value system, and (c) those
patients rated as “most improved” by their therapists were closer to their
therapists in values than patients rated “least improved.” The proposition
that values move toward similarity in ongoing therapist-patient dyads was
not refuted.


Of signal interest in the theory of psycho-
therapy is the following question: What rele-
vance do the value systems of therapists and
their patients have in the therapeutic proc-
ess? Viewed from a broader perspective than
the present study, values are a necessary ac-
quisition throughout childhood and in subse-
quent value-changes in adult life. The minimal
human group of two, the dyad, is one pri-
mary context in which socialization occurs
(Simmel, 1964). It is particularly within this
context of human intimacy that values germi- 
nate, develop, and change throughout life. 
What are the parameters that effect changes 
in values in adult life? This is the general 
issue of concern here, within the confines of 
the psychotherapeutic setting. 

A proposition found in several psychologi-
cal theories is the theoretical background for
the present research. It is: There is move-
ment toward similarity or equilibrium in so-
cial interaction in a dyad. This “equilibrium”
proposition has consensual theoretical valida-
tion since it occurs in the diverse areas of
dissonance theory (Brown, 1965), balance
theory (Heider, 1958), psychotherapeutic
theorizing (Lennard & Bernstein, 1960),
psycholinguistics (Jaffe, 1964), developmental
research (Escalona, 1965), and theorizing
about values (Homans, 1961; Williams,
1958).

1 This study was financed in part by the Arts and
Science Research Fund, New York University. We

One logical inference from the proposition 
is that interaction between two people of 
some duration leads to a growing similarity 
of their values. Another is that if values are 
too divergent between two people, there may 
be so much disequilibrium that they do not 
maintain the dyad. 

“Value,” of course, is an abstraction of a
person’s experience and, as such, is a cogni-
tive pattern. The function of this pattern
is to guide conduct. The concept of value,
in terms of content, focuses more on domi-
nant or deviant attributes of a culture or sub-
culture; whereas the concept of attitude is
oriented more toward specific opinions of an
individual. There is, however, a remarkable
overlap of the two concepts. In terms of their
structure, that is, a cognitive pattern, their
motivational component, and their function,
they seem to be used interchangeably. Even
in terms of content they are often used as
interchangeable concepts. Thus, for example,
one speaks of religious attitudes or religious
values, sexual attitudes or sexual values, etc.
Put on a continuum, values seem somewhat
closer to mores than are attitudes, and are
somewhat broader in scope. Values are fre-
quently spoken of as value orientations² or
value systems, but attitudes are not used in
this adjectival sense. The concept of value is
used in the present study since concern is
directed toward a general way or preference
for living. Value is a concept which can
be inferred from expressed preferences or
choices. In the present study, the concept
refers to those ongoing cognitive patterns for
guiding conduct which are inferred from
verbal statements of preferred ways of living.

There is a small body of research which 
has explored values and psychotherapy. Glad 
(1959) stresses value systems which are con- 
ceptualized in psychotherapy theory and de- 
scribes some operational dimensions of them. 
A thesis of his is that judgment of improve- 
ment during psychotherapy is contingent 
upon the similarity of the patient’s personal- 
ity to the theoretically derived methods and 
goals of treatment. Buhler (1962) studied 
therapist’s involvement with the value prob- 
lems of their patients, and values as an aspect 
of self-development and how this affects the 
process of therapy. There is a current ex- 
ploration® of the extent to which psycho- 
therapists’ values, as measured by the Ways 
to Live (WTL) scale (Morris, 1956), are 
a function of theoretical orientation, length 
of professional experience, etc. Several in- 
vestigators (Burdock, Cheek, & Zubin, 1960) 
have shown a relationship between candidates’ 
success in psychoanalytic training and simi- 
larity with their supervisors’ interest patterns, 
as measured by the Strong Vocational Interest 
Blank (SVIB). The suggestion is that trained 
analysts may be equating criteria for the 
“well-adjusted” personality with their own in- 
terest or value systems. Opinions gathered 
(Wolff, 1954) from a variety of psychothera- 
pists led to the conclusion that therapists be- 
lieved their theoretical value systems tended 
to be adopted by their patients in successful 

2See, for example, Kluckhohn and Strodtbeck, 


1961. . 
3 Personal communication, 1962. 


treatment. In a study of 12 patients before 
and after psychotherapy (Rosenthal, 1955) it 
was shown that patients who were rated as 
improved by their therapists tended to adopt 
the moral values of their therapists, at least 
on the tested dimensions of sex, aggression, 
and discipline. In another investigation 
(Rosenbaum, 1956), it was found that those 
patients who were most religious were least 
likely to benefit from psychotherapy. The 
rated lack of improvement among religious
patients may have been related to the gross 
disparity of values between patient and thera- 
pist. There is, thus, a small body of research 
available indicating that values have impor- 
tance in the psychotherapy dyad, and that 
patients who are judged by their therapists 
as improved move toward the values of the 
therapist. 

The present study is the beginning phase 
of ongoing research investigating values in 
the therapist-patient dyad. In addition, since 
the perception of the personality character- 
istics and behavior of others may be influ- 
enced by value discrepancy, the study ex- 
plores the therapists’ perception of improve- 
ment of their patients as it is influenced by 
value similarities and differences. 

Do psychotherapists have a wide range of 
values toward basic human ways of living? 
If so, any one patient may begin therapy 
with a therapist whose values are similar to or 
quite disparate from his own. In the light of 
the many possible therapist-patient dyadic 
relationships, the specific aims of this study 
are to explore and generalize from the fol- 
lowing propositions: 

1. Therapists and their own patients have 
more similar value systems than random pairs 
of therapist—“not-own” patients. The ques- 
tion posed here is: Are therapists so homo- 
geneous in values that when patients move 
closer to their own therapist’s values, they 
are moving closer to the values of therapists 
in general; or, when patients move closer to 
their own therapist’s values are they simul- 
taneously moving farther away from other 
therapists’ values? The extreme hypothesis 
which states that all that happens to values 
in therapy is that patients move closer or 
further away from their own therapists would 
result in the finding of no difference between 
the two sets (therapist-own patients and
random therapist-patients). 

This hypothesis demands a study of (a) 
the extent of homogeneity of values among 
psychotherapists and among patients and (b) 
the extent of similarity of therapists’ values 
with those of their own patients. 

2. The second hypothesis is that there is 
a relationship between similarity of patient- 
therapist values and therapists’ subjective 
evaluations of the patients’ mental health 
status: patients rated as “improved” by their 
therapist will be expected to have greater 
similarity than patients rated “not improved.” 


METHOD 
Subjects

Subjects were psychotherapists and their patients 
at two New York City psychoanalytic training institutes:
the William Alanson White Institute of
Psychiatry, Psychoanalysis and Psychology (hereafter 
called WAW) and the New York University Post- 
doctorate Center (hereafter called NYU). The 
sample was taken at these two institutes both to 
increase sample size and to provide some basis for 
generalization over settings. There is no reason the 
authors can think of, however, which would make 
this patient sample markedly different from samples 
that could be obtained at other psychoanalytic train- 
ing institutes in New York City. Psychotherapists 
at various institutes, of course, would vary in their 
theoretical bent. 

The psychotherapists were first- and second-year 
candidates in psychotherapy and psychoanalytic 
training. Therapists were selected from the first 2 
years of training since, at WAW, it was in these 
years of training that therapists were working in the 
Psychotherapy clinic rather than in the psycho- 
analytic service. A comparable level of training of 
therapists was maintained for the sample of candi- 
dates from NYU. All psychotherapists were psy- 
chologists or psychiatrists who had the PhD or MD 
degree plus several years of clinical experience in 
a variety of different mental health settings. They 
thus have a heterogeneous background in profes- 
sional training and experience. Of the total sample 
of 38 psychotherapists, 26 were at WAW and 12 
at NYU. Eleven of the psychotherapists from WAW 
were second-year candidates at the time of their 
testing, but were already third-year candidates at 
the time of testing patients. These therapists were 
included only in the evaluation of homogeneity of 
therapists’ values. Of a total of 15 first- and second-
year candidates at NYU, 12 participated in the 
study. 

The sample of 44 patients were those seen by 
the therapists participating in the study. This con- 
sisted of 33 patients from WAW and 11 from NYU. 


J. WELKOWITZ, J. COHEN, AND D. ORTMEYER


The sample of patients in this study was approxi- 
mately three-quarters of the total therapy-patient 
load seen by the therapist sample. Patients who saw
therapists for four or fewer sessions were not included
in the sample. This is the reason that not all
the patients seen by the therapist sample were 
included in this study. The average patient was 
white, in his (her) mid-twenties, above average in 
intelligence, had high school education, worked or 
went to school, lived in the New York City area, 
and was seen in therapy once a week. 


Procedures 


The study demanded a scale that would dis- 
criminate within a relatively homogeneous group of 
therapists. Preferably it would be one that yielded 
a parsimonious set of value measures, arrived at 
by factor analysis, so that individual value profiles 
for each therapist and patient could be obtained. 
The WTL (Morris, 1956) is such an instrument. 
It is a scale which aims at describing basic human 
values, and is composed of 13 different paragraphs. 
Each paragraph describes a way of life which the 
respondent rates on a 7-point scale for the degree 
to which he personally would like to lead that way 
of life. 

The second instrument used was the SVIB 
(Strong, 1943). It taps a wide range of personal 
preferences in all areas of life which, it is assumed, 
express underlying value dimensions. This instrument,
covering consequences of values over a wide
range of interests, may include areas missed by the 
WTL scale. The SVIB also had the advantage of 
having been successfully used at the Columbia Psychoanalytic
Clinic (Burdock et al., 1960) to evaluate
differences in interests among psychoanalytic candidates
and their supervisors.

In addition to the value scales, ratings by the 
therapists of extent of patient improvement were 
obtained. These ratings were made on a 6-point scale 
as follows: (1) much worse, (2) worse, (3) no im- 
provement, (4) some improvement, (5) moderate 
improvement, and (6) marked improvement. As our 
study shows, we are not interested in the usual 
efficacy of therapy ratings, but in the therapist's
perception of improvement as related to value 
similarity. 

The WTL and SVIB were administered to pa- 
tients and therapists during October and November 
of 1964. The patients had been in therapy for periods
ranging from 1 to 9 months. Twenty-nine patients
had been in therapy more than 6 months and
the remaining 15 less than 6 months. Only two patients
had been in therapy less than 2 months. Within
2 weeks after testing, each therapist evaluated extent
of improvement for each of his or her patient(s).


RESULTS 


The first hypothesis was tested (a) by 
evaluating value scores of therapists and pa- 
tients; that is, do patients and therapists 
share common value systems? (b) by evaluating
differences between therapist-“own-patient”
pairs and therapist-“not-own-patient” pairs.


Heterogeneity of Therapists-Patients Value 
Scores 


The scores on the 60 scales of the SVIB 
and the 13 scales of the WTL were converted 
to standard scores on the basis of the per- 
formance of the entire group of subjects. 
Q-technique factor analyses (correlating be- 
tween persons over test items) were com- 
pleted for each instrument separately, using 
centroid-factor extraction and Varimax rota- 
tion. In both instances, six centroids were 
needed to account for the correlation matrix 
with trivial residuals. Taking into account 
both positive and negative loadings on each 
of the six factors (thus ending up with 12 
types), and using a factor loading of .37 as 
the minimum criterion,⁴ it can be seen from
Table 1 that patients and therapists are not 
differently distributed over the 12 types on 
either test. Table 2 indicates that with pool- 
ing therapists and patients the loadings are 
uniformly distributed. Thus, with regard to
both instruments, the data suggest patients 
and therapists appear on all factors.⁵ Heterogeneity
of therapists is operationally defined


4 A factor loading of .37 was used since loadings
below this indicated a higher loading on another 
factor. Only the patient’s or therapist’s highest factor 
loading was extracted from each matrix. 

5 The chi-square test is approximate because of 
small expected frequencies. However, the effect of 
small frequencies is to positively bias chi-square 
and make it too high. But chi-square is still not 
significant and the bias is against our demonstration. 
Also, the number of degrees of freedom is high, and 
the resulting test which is approximate is adequate 
to the multinomial. 


TABLE 1

CHI-SQUARE TESTS OF NO DIFFERENCE IN DISTRIBUTION
OF FREQUENCIES BETWEEN PATIENTS AND
THERAPISTS OVER THE TWELVE FACTORS (SIX
POSITIVE, SIX NEGATIVE)

          χ²       df     p
SVIB      5.08     11     .90 < p < .95
WTL       11.28    11     .30 < p < .50


TABLE 2

CHI-SQUARE TESTS OF COMBINED PATIENTS AND
THERAPISTS DISTRIBUTION OVER THE TWELVE
FACTORS

          χ²       df     p
SVIB      8.39     11     .50 < p < .70
WTL       6.72     11     .80 < p < .90


here in multidimensional terms, that is, thera- 
pists’ loadings on different value type factors 
on the SVIB and WTL. Since there is no
great concentration of therapists in some types 
relative to others, the a priori possibility that 
therapists share a homogeneous value scheme 
appears to be negated.⁶
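
The two kinds of chi-square test reported in Tables 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the frequency counts below are hypothetical (the article does not report the cell frequencies), and the function names are introduced here for the example. With 2 groups and 12 types the test of no difference has (2 - 1)(12 - 1) = 11 degrees of freedom, and the test of a uniform distribution over 12 types has 12 - 1 = 11, which agrees with the df shown in both tables.

```python
# Hypothetical counts for illustration only; not the study's data.

def chi_square_independence(table):
    """Pearson chi-square for an r x c table of counts; returns (chi2, df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi_square_uniform(counts):
    """Goodness of fit of pooled counts to a uniform distribution."""
    expected = sum(counts) / len(counts)
    chi2 = sum((o - expected) ** 2 / expected for o in counts)
    return chi2, len(counts) - 1


# Number of persons whose highest loading falls on each of the 12 types.
therapists = [4, 3, 2, 4, 3, 3, 4, 2, 3, 4, 3, 3]   # 38 hypothetical therapists
patients = [4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4]     # 44 hypothetical patients

print(chi_square_independence([therapists, patients]))   # cf. Table 1
pooled = [t + p for t, p in zip(therapists, patients)]
print(chi_square_uniform(pooled))                        # cf. Table 2
```

A small chi-square relative to its degrees of freedom, as in Tables 1 and 2, is what one would expect if therapists and patients are spread over the types in much the same way.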


Congruence of Therapists’ Values with Those 
of Their Own Patients 


In order to find out whether, at an initial 
point in therapy, similarity in value scores 
between therapist-own-patient pairs is or is
not significantly different from therapist-not- 
own-patient pairs, it would have been neces- 
sary to obtain value scores at the beginning of 
therapy. Since patients had already been in 
therapy from 1 to 9 months at the time of 
the initial testing, the proposition tested was 
that therapists and their own patients, after 
interaction in therapy, are more similar in 
values than random pairs of therapists and 
patients. 

To test this proposition, two sets of cor-
relation coefficients were selected from both 
the WTL and SVIB matrices: 

1. The correlations of patients with their 
own therapists. (All therapist—“own-patient” 
pairs were used.) 

2. The correlations of therapists with ran- 
domly selected, not-own patients. (Therapists 
were selected from the matrices using a table 
of random numbers.) 
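
The construction of the two sets of correlations can be illustrated with a short sketch. It assumes, as described earlier, that scale scores are first converted to standard scores over the whole group and that each person-to-person correlation is then taken over the scales. The profiles, the therapist assignments, and the helper names are hypothetical, and a seeded random generator stands in for the table of random numbers.

```python
import random
from statistics import mean, pstdev


def standardize_columns(scores):
    """Convert each scale (column) to z scores over all persons (rows)."""
    columns = list(zip(*scores))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in scores]


def pearson(x, y):
    """Product-moment correlation between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical raw scale scores (rows = persons, columns = scales).
raw = {
    "T1": [5, 2, 6, 3, 4],
    "T2": [2, 6, 3, 5, 1],
    "P1": [6, 1, 5, 2, 5],
    "P2": [1, 7, 2, 6, 2],
}
own_therapist = {"P1": "T1", "P2": "T2"}   # hypothetical assignments

names = list(raw)
z = dict(zip(names, standardize_columns([raw[n] for n in names])))

rng = random.Random(0)   # stands in for a table of random numbers
therapists = sorted(set(own_therapist.values()))

# Set 1: each patient correlated with his or her own therapist.
own_similarity = {p: pearson(z[p], z[t]) for p, t in own_therapist.items()}

# Set 2: each patient correlated with a randomly drawn not-own therapist.
not_own_similarity = {
    p: pearson(z[p], z[rng.choice([t for t in therapists if t != own])])
    for p, own in own_therapist.items()
}

print(own_similarity)
print(not_own_similarity)
```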

The therapist-own-patient pairs were sepa- 
rated into two groups: those rated as some, 
moderate, or marked improvement (improved- 
patient group), and those rated no improve- 


6 The verbal description of the type factors, their 
relationships between SVIB and WTL, is lengthy 
and will appear in another publication now in 
preparation.
TABLE 3

ANALYSIS OF VARIANCE AMONG MEANS OF CORRELATIONS BETWEEN THERAPISTS’ AND
PATIENTS’ SCORES ON THE SVIB AND WTL

                      Improved       Not-improved   Not-own
                      patients and   patients and   patients and
                      their          their          their
                      therapists     therapists     therapists     df            MS         F

SVIB
  Number of pairs     20             19             40
  Mean r              [illegible]    .26            .02
  Source of variance
    Between groups                                                 2             6,955.0    [illegible]
    Error                                                          [illegible]   1,083.9

WTL
  Number of pairs     18             17             40
  Mean r              .37            .11            -.05
  Source of variance
    Between groups                                                 2             10,216.5   15.9*
    Error                                                          72            614.2

* p < .01


ment, worse, or much worse (not-improved- 
patient group). An analysis of variance was 
performed among the three sets of mean 
correlations. The results are presented in 
Table 3. 
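
A one-way analysis of variance of this kind can be sketched as below. The correlation values are hypothetical and the function name is introduced only for the example; the sketch follows the usual textbook computation rather than the authors' actual worksheet. The error degrees of freedom equal the total number of pairs minus the number of groups, which for the WTL gives 18 + 17 + 40 - 3 = 72, the value shown in Table 3.

```python
from statistics import mean


def one_way_anova(groups):
    """Return between/within df, mean squares, and F for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical patient-therapist correlations for the three sets of pairs.
improved = [0.52, 0.31, 0.44, 0.28, 0.39]
not_improved = [0.18, 0.05, 0.22, 0.10]
not_own = [0.02, -0.11, 0.08, -0.04, 0.01, -0.07]

result = one_way_anova([improved, not_improved, not_own])
print(result)
```

A significant F only indicates that the three means are not all equal; a multiple-range procedure is still needed to say which pairs of means differ.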

Using the Duncan new multiple-range 
test for unequal numbers of replications 
(Kramer, 1956), significant differences were 
found on both instruments between (a) 
therapists-improved patients and therapists- 
not-own patients and (b) therapists—not- 
improved patients and therapists-not-own pa- 
tients. (Significant differences were also found 
between therapist-improved patients and 
therapists—not-improved patients). These re- 
sults support the hypothesis that the value 
similarity between therapists and their own 
patients is greater than the value similarity 
between therapists and random not-own pa- 
tients. Since patients were not tested before 
initiation of therapy and length of time in 
therapy before testing varied, Pearson cor- 
relation coefficients were obtained between 
similarity of values and length of time in 
therapy (to the nearest month). The resulting
coefficients were r = .49, p < .01 for the
SVIB and r = .42, p < .05 for the WTL.


Thus, it appears that value similarity tends 
to increase as a function of length of time 
in therapy. 

The second hypothesis tested was that 
there is a relationship between similarity of 
patient-therapist values and therapists’ sub- 
jective evaluations of the patient’s mental 
health status. The Improvement Rating scale 
was dichotomized by combining the categories 
of much worse, worse, and no improvement 
into one group and some improvement, 
moderate improvement, and marked improve- 
ment into a second group. For each patient, 
we extracted from the correlation matrix the 
correlation coefficient between the patient and 
his therapist on the instrument in question. 
These coefficients were then treated as 
“similarity-to-therapist” scores. A biserial cor- 
relation was obtained between the dichoto- 
mized improvement scores and the similarity 
scores. The resulting coefficients were r = .45,
p = .01 for the SVIB and r = .36, p < .05
for the WTL. These results indicate a signifi- 
cant relationship between extent of patient- 
therapist value similarity and perception of 
patient improvement by the therapist.
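
The biserial computation described here can be sketched as follows, using the standard formula r_b = ((M1 - M0) / s) (pq / y), where M1 and M0 are the mean similarity scores of the improved and not-improved groups, s is the standard deviation of all similarity scores, p and q are the proportions in the two groups, and y is the ordinate of the unit normal curve at the point dividing p from q. The ratings and similarity scores are hypothetical, and the function name is an assumption made for the example.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores, improved):
    """Biserial r between a continuous score and a dichotomized rating."""
    group1 = [s for s, flag in zip(scores, improved) if flag]
    group0 = [s for s, flag in zip(scores, improved) if not flag]
    p = len(group1) / len(scores)
    q = 1 - p
    unit_normal = NormalDist()
    ordinate = unit_normal.pdf(unit_normal.inv_cdf(p))
    return (mean(group1) - mean(group0)) / pstdev(scores) * (p * q / ordinate)


# Hypothetical data: 6-point improvement ratings and similarity scores.
ratings = [2, 3, 3, 4, 4, 5, 5, 6]
similarity = [0.05, 0.30, 0.12, 0.15, 0.41, 0.20, 0.36, 0.47]

# Ratings of 4 (some improvement) or higher form the improved group.
is_improved = [r >= 4 for r in ratings]

print(biserial(similarity, is_improved))
```

Because the biserial coefficient assumes a normally distributed variable underlying the dichotomy, it can exceed the corresponding point-biserial value and, with small or skewed samples, may even fall outside the range from -1 to 1.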

Assuming patients and therapists were not
initially matched for value similarity, the
“improved” patient is moving closer to his 
own therapist’s values than is the “unim- 
proved” patient. Since therapists do not share 
common value systems, the patient is simultaneously
more divergent from the value
position of some other therapists.


DISCUSSION


To return to the theoretical proposition, 
that there is movement toward similarity or 
equilibrium in social interaction in a dyad, 
the logical inference drawn from this proposi- 
tion was that interaction of some duration be- 
tween two people leads to a growing similarity 
in their values. In this study, the duration 
of the dyad varied from 1 week to 9 months 
at the time that values were sampled. The 
results indicated that the value distance between
therapists and their own patients is
smaller than the distance between randomly
paired therapists and patients. Does this find- 
ing reasonably give credence to the proposi- 
tion and its inference? 

It would be crucial if therapists were able to 
select the patients they desired to treat or vice 
versa. If so, then the value similarity may 
not have been due to the interaction in the 
dyad but due to the selection process. This 
would be important information in its own 
right. That is, do we, given the opportunity, 
select people with whom to relate who share 
our values? For the present study, however, 
it would only confuse the results. The selec- 
tion procedures at WAW and NYU were es- 
sentially the same and as follows: The intake 
social worker saw each applicant for an initial 
interview. The social worker then assigned 
the patient to a therapist on the basis of 
practical considerations, such as time avail- 
able. The therapist had the right not to see 
the patient for therapy. Almost without ex- 
ception, however, the therapist did continue 
to see the patient for psychotherapy. The 
exigencies and functioning of clinics which 
serve the community mental health needs, 
unfortunately, do not allow for random pair- 
ing of patients with therapists. The practical 
conditions of pairing, however, do not suggest
value similarity or other related measures.
While it cannot be excluded, it seems
reasonable to doubt that the initial selection
was made on the basis of value similarity. 

There was a significant correlation between 
length of treatment and similarity of patient- 
therapist values. This, at the very least, sug- 
gests that value similarity may have increased 
as a function of time. The evidence available, 
therefore, points more in the direction of 
value similarity being a function of the thera- 
peutic interaction than the initial selection. 
While the present results do not prove the 
central proposition, there is more reason 
than not to believe that they support our 
theoretical point of view. 

The statement that interaction of some dura- 
tion between two people leads to similarity in 
their values can be cast in more precise terms 
as follows: There is a process of unlearning 
values and learning other values. The impli- 
cation here is that there is a qualitative dif- 
ference or difference in “kind” of values. It 
is very possible that the patients in this study 
had values not qualitatively divergent from 
their therapists since they were young, 
bright, upwardly mobile patients. The value 
divergence then would be a quantitative one. 

If it was a quantitative one, part of the 
overall picture of these patients may have 
been a vagueness and confusion about values, 
a lack of explication of them and perhaps 
an unsuccessful upholding of contradictory 
values. One function, then, of the therapy 
dyad was to allow the patients to move on 
a continuum toward greater commitment and 
explication of values. These suggestions are 
in line with a study (Morris, Eiduson, & 
O'Donovan, 1960) in which the WTL scale 
was given to 50 outpatients of a mental hy- 
giene clinic, their spouses or closest friends, 
and a control group. One of their findings 
was that patients do not repudiate the values 
of the culture, but have difficulty in 
achieving them. A conclusion of theirs is that 
“Psychotherapy . . . aims at... not the 
changing of values of the disturbed person 
but helping him to develop more effective 
techniques of realizing those values he holds 
[p. 310].” 

Since, in the present study, the therapists 
are no more homogeneous in values than pa- 
tients, an individual patient presumably could 
apply for treatment with a therapist who 
varies greatly from his values. Suppose the
extreme case were taken in which the total
patient sample was quite divergent in values 
(qualitatively so) from the therapists. This 
would seem to be the case with blue-collar 
workers or patients of a different cultural 
and/or racial background. It is a well-known 
finding that the blue-collar workers’ dropout 
rate from therapy is very high. Is it pos- 
sible that values in these dyads are so 
divergent that the dyad has little chance of 
continuing? This question is consistent with 
a second inference from the general proposi- 
tion previously stated: If values are too 
divergent between two people, there may be 
so much disequilibrium that they do not 
maintain the dyad. Research is needed to 
answer this question. If it is true, then thera- 
pists in their training should have explicit 
exploration of their values. They might also 
profit from experience and supervision in 
treating patients of cultural or subcultural 
backgrounds very divergent from their own. 
It would also seem likely that a few 
therapist-patient dyads are most divergent in 
values, yet they continue. The selection and 
close scrutiny of these dyads in further re- 
search should give clues to the treatment of 
patients with values very divergent from their 
therapists. 
This study was designed to provide a
method and yield results at a molar level, to 
provide a general framework from which more 
refined data could be examined. It was not 
geared to sample values at certain time inter- 
vals nor at crucial therapeutic junctures. Im- 
portant in future research is the comprehen- 
sion of the ebb and flow of value fluctuations 
in dyads. Of equal importance is the assess- 
ment of the lasting nature of shifts in values 
after the dyad is dissolved. Following are a 
few questions, hopefully to serve as challenges 
to future research: Does the time of most 
value change occur at the period of most 
intense transference? What is the relation of 
regression to shifts in values? Are certain diag- 
nostic categories, such as hysteria, more 
prone to value changes than others, such as 
paranoia? Is intense resistance correlated with
divergence in values? Is dissolution of the 
dyad followed by backward shifts to previ- 
ously held value positions? 


Another finding of this study was that 
there is a significant correlation between 
therapists’ ratings of patients’ improvement 
and similarity of patient-therapist values.
Those patients who were rated as most im- 
proved were closest in values to their own 
therapists. Here we are studying one aspect 
of the therapists’ perception of their patients.
It is a perception involving a judgment as 
to change in a patient in one direction, that 
is, a “healthy” direction. We did not use 
other criteria for “improvement” since our 
central concern was the therapist’s perception 
of improvement. We are not concerned with 
an objective appraisal of patients’ improve- 
ment. Traditionally, the therapist’s judgment 
of improvement in his patients is thought to 
rest on such factors as symptom or tension
reduction, resolution of a life problem, less- 
ened rigidity of defenses, a decrease in resist- 
ance, directness of expression of feelings, etc. 
It is also recognized that the therapist’s judg- 
ment of improvement is affected by counter- 
transference. The present study suggests an- 
other important, if problematic, correlation 
with the therapists’ judgment, that of value 
similarity. 
WHAT UNITS SHALL WE EMPLOY?
ALLPORT’S QUESTION REVISITED 
The present paper is an attempt to develop alternative concepts and strategies
for the study of personality in both research and applied settings. In place
of traditional dispositional concepts, the implications of an abilities conception
of personality are explored. Response capability and response performance are
suggested as central foci for personality assessment. Personality structure is
construed in terms of skills rather than dispositions. Individual differences in
response to factors controlling response performance are considered with
regard to 3 sets of situational factors, that is, reinforcement conditions,
situation-specific hypotheses, and formal attributes of situations. The implications
of an abilities conception of personality for personality change are discussed.


Despite the existence of a substantial body 
of theory and data, the problems which face 
the psychologist interested in personality 
theory, measurement, and assessment continue 
to appear formidable. The search for stable 
and enduring individual difference variables 
(other than intellective ones) which would 
permit prediction of behavior across varied 
stimulus situations has yielded considerably 
less than modest successes. One is increas- 
ingly tempted to argue that the time has 
come to vary our strategies in personality
research. Further data collection along traditional
lines may very well prove inadequate
for the task at hand. Our current problems 
in personality appear to constitute matters of 
conceptualization rather than further data 
collection. The present paper is an attempt 
to develop an alternative strategy for the 
psychologist interested in personality in both 
research and applied settings.

Viewed in historical perspective, it would 
not be unfair to characterize our efforts in 
personality research and theory as a search 
for viable units of study. In a provocative 
paper, Allport (1958) raised the question: 
“What units shall we employ?” As we ex- 
amine personality research in historical per- 
spective, it seems clear that much of our 
effort has been devoted to finding an answer 
to this question. While much has been 
learned, both methodological and substantive, 
from efforts to view human behavior and 
thought through templates such as need, 
drive, trait, type, instinct, and habit, it is
entirely possible that units of potentially
greater utility are now available.


Structure of Personality: Limited Utility of
Dispositional Units


The majority of our structural units in the 
study of personality have centered about the 
focal concept of response predisposition. Con- 
cepts such as instinct, need, drive, and trait 
are considered energy units as well as structural
units and, as such, possess clear motivational
implications. For example, Allport
(1937) argued that “trait” is both a motivational
concept as well as a structural concept.
For Allport, a “gregarious” person is not only
in evidence in social situations but such a
person frequently arranges such situations in
order that trait-related behaviors might become
manifest. Similarly, a “need-related response”
to a Thematic Apperception Test
card is thought to have implications beyond 
the mere presence of the “need” in the per- 
sonality structure of the individual giving 
such a response. It is frequently assumed 
that the individual characterized by given 
needs will act to bring about homeostatic 
balance. 

If a response dispositional concept were 
taken to mean nothing more than a statement
of the probability of a given response in a
well-defined stimulus situation, one would
have little with which to quarrel. Thus, for 
example, while scores on various measures of 
“intelligence” predict academic grades with
moderate success, few persons, if any, would 
argue that intelligent persons are predisposed 
to behave intelligently. While nonintellective 
factors may very well contribute to the dem- 
onstrated relationship between measured in- 
telligence and academic achievement, to ex- 
plain this relationship entirely in motivational 
terms would appear absurd. In this instance, 
a statement of probability concerning the 
prediction of certain events suffices. 

But as we have noted, dispositional units 
in the study of personality go far beyond 
statements of response probability. Traits and 
needs are assumed to operate as motivational 
determinants of behavior across varied stimu- 
lus situations. The problem for the personolo- 
gist interested in assessing personality within 
the framework of trait or need theory has 
been one of laying bare the structural aspects 
of personality and, hence, the response tend- 
encies which presumably reside in the indi- 
vidual. The equation is as follows: struc- 
ture = predisposition = response probability. 
While a theory of generalized response pre- 
dispositions which operate across varied 
stimulus situations is appealing and elegant 
in its simplicity, it suffers from at least two 
major flaws. 

First, the attempt to devise relatively pure 
measures of response dispositions has posed 
serious methodological, operational, and con- 
ceptual difficulties which appear formidable. 
Elsewhere, this writer (Wallace, 1966) has 
argued that typical measures of response pre- 
disposition are very likely confounded by a 
neglected but highly important response prop- 
erty, response capability. For example, a 
“hostile” response to a projective stimulus 
may suggest nothing more than the individual 
giving such a response is capable of a verbal 
response of this nature in this particular con- 
text. However, to assume that such an indi- 
vidual is predisposed to respond with hostility 
in other situations and through response 
modes other than verbal ones is clearly un- 
warranted. Continued problems in the attempt
to assess dispositional units raise
serious questions about the ultimate utility 
of such units. 

Aside from problems of measurement, the 
loose equivalence of structure, disposition,
and response probability constitutes the second
major weakness. As recent research in
trait theory has indicated, efforts to predict 
behavior from measures of personality struc- 
ture have yielded considerably less than mod- 
est successes (e.g., Brim, Glass, Lavin, & 
Goodman, 1962; Endler, Hunt, & Rosen- 
stein, 1962; Hilgard, 1965; Petersen, 1965; 
Rorer, 1965). While a more sophisticated 
trait model such as that employed by Kogan 
and Wallach (1964) may result in increased 
prediction, it appears likely that our failures 
to predict behavior from measures of person- 
ality structure are attributable in large part 
to neglect of psychosocial variables in the 
assessment situation itself as well as in the 
criterion situation to which predictions are 
made.


Response Capability and Response 
Performance as Focal Concepts 


As alternatives to dispositional units in the 
study of personality, the strategy presented 
here involves the two focal concepts of re- 
sponse capability and response performance. 
The task of the personologist interested in 
assessment is inescapably twofold. First, an 
analysis of the structure of personality should 
center about the extant response repertoire, 
that is, the response capabilities of the indi- 
vidual. Stated quite simply, if one wishes to 
get a person to perform a given response, one 
must first determine whether or not the per- 
son is capable of performing the response. 
Second, one must specify the conditions 
necessary for performance of the response. 
Statements about the structural aspects of 
personality are, in and of themselves, mean- 
ingless unless one can specify the conditions 
under which the individual can demonstrate 
his capabilities. Obviously, questions concern- 
ing response capability and response perform- 
ance are related considerations. It would ap- 
pear impossible to assess response capability 
without making some statement about condi- 
tions controlling response performance. As 
researchers in personality have come to ap- 
preciate, observations are always gathered in 
some situational context (e.g., Masling, 1960; 
Murstein, 1963; Rotter, 1960). 

Construing personality in terms of response 
capability rather than response disposition 
leads one quite naturally to a definition of
personality structure in terms of abilities. 
Accustomed to dispositional concepts, the 
personologist may think it odd indeed to con- 
sider such things as “hostility,” “gregarious- 
ness,” “nurturance,” “assertiveness,” “de- 
pendence,” etc., as skills rather than disposi- 
tions. However, a moment’s reflection would 
readily suggest that it is not only possible but 
possibly fruitful to construe psychological 
phenomena such as these in this manner. 
Thus, for example, when observers are in 
agreement that a particular individual has 
performed a sequence of responses that can 
be labeled “aggressive,” it would appear rea- 
sonable to assert that the individual is capa- 
ble of assuming an aggressive role. And the 
student who seeks assistance from others be- 
fore a given examination can be described as 
possessing the skill of dependency. 

At first blush, it might appear that a capa- 
bilities conception of personality as presented 
here amounts to little more than a play upon 
words. However, our choice of units of study 
in personality is a matter of far greater sig- 
nificance than is immediately apparent. A 
skills conception of personality structure pos- 
sesses vastly differing implications for the 
problems of personality measurement and 
change than those indicated by various dispo- 
sitional conceptions. As this writer has shown 
(Wallace, 1966), an abilities conception of 
personality leads one to measurement opera- 
tions which are quite at odds with those de- 
manded by dispositional conceptions. And as 
will be developed shortly, the implications for 
personality change which stem from an abili- 
ties conception are strikingly different from 
those which obtain within dispositional con- 
cepts. In addition to differing implications 
for both measurement and change, an abili- 
ties conception of structure permits one to 
avoid the persistent epistemological quan- 
daries inherent in dispositional conceptions. 
How does one, for example, decide upon the 
level at which the real dispositions of the 
person (whatever that may mean) can be 
found? And when faced with the inevitably 
incompatible evidence, how does one decide 
upon the stimulus situations in which such 
dispositions can be expected to appear in 
behavior? 


In contrast to dispositional descriptions of 
structure, response-capability descriptions do 
not comprise statements of response probabil- 
ity. While the description of personality 
structure in terms of capabilities has implica- 
tions for response performance, factors other 
than capability enter into the prediction of 
response performance. Obviously, while an 
individual cannot perform a response which 
is not in his repertoire of responses, the mere 
fact that it is does not guarantee that the re- 
sponse will be performed. In other words, the 
factors which control response performance 
do not inhere in response capability. 

Recent research on social learning by Ban- 
dura (1965a) demonstrates that capabilities 
cannot be expected to eventuate in actual be- 
havior until appropriate incentive conditions 
are introduced. In Bandura’s research, chil- 
dren observed a film-mediated model engage 
in highly novel, aggressive responses. The 
model’s aggressive behavior was punished, 
rewarded, or left without consequences in 
three experimental conditions. Postexposure 
tests revealed that response consequences for 
the model resulted in differential amounts of 
imitative behavior of the novel aggressive re- 
sponses. Children who had observed the model 
being punished for aggressive responses dis- 
played significantly less imitative behavior 
than children who had observed the model 
rewarded or left without consequences. How- 
ever, when reinforcements contingent upon 
reproduction of the model’s responses were 
offered directly to the children, differences in 
performance were totally eliminated. In short, 
while the children in all conditions were capa- 
ble of reproducing the model’s behavior, ap- 
propriate incentives had to be introduced 
before response performance could be ob- 
tained. Bandura’s research suggests that dis- 
crepancies between response capability and 
response performance can be expected under 
conditions of negative sanctions. 

While some personologists have chosen to 
view situational factors as matters best left 
to the social psychologist or experimental 
psychologist, it seems to be the case that situ- 
ational determinants of behavior are as much 
the legitimate domain of the individual dif- 
ferences psychologist as they are of the psy- 
chologist searching for general stimulus-re- 
sponse laws. Moreover, an examination of 
situational determinants of behavior, as we 
shall see, can prove most congruent with the 
interests of the personologist. In the follow- 
ing discussion, three sets of situational fac- 
tors are presented as follows: reinforcement 
conditions, situational specific hypotheses, 
and formal properties of situations themselves. 


Reinforcement Conditions 


In considering situational factors, our at- 
tention is immediately drawn to individual 
differences in response to various conditions 
of reinforcement. The old adage that “one 
man’s meat is another man’s poison” holds 
true for reinforcement. Even the most casual 
observation of human beings suggests that 
various classes of incentives do not have equal 
appeal. Examination of recent research in 
behavioral modification (e.g., Krasner & Ull- 
mann, 1965) indicates clearly that the assess- 
ment of persons in clinical settings might well 
include procedures for determining the effec- 
tiveness and appropriateness of various classes 
of reinforcing stimuli. 

Staats and Staats (1963) convincingly ar- 
gue that differences in social learning histories 
can be expected to produce differences in 
preferences for given classes of reinforce- 
ments. Social learning theorists have long 
been concerned with the concept of reinforce- 
ment value. Rotter (1954), in his social 
learning theory, assigned an important role to 
the concept of reinforcement value. More 
recently, Homans (1961) has equated rein- 
forcement value with the economic theoretical 
construct of utility. In short, the fact that 
human values range over an extraordinary 
span of events should alert the individual dif- 
ferences psychologist to the importance of 
including procedures for the assessment of 
such matters. 

General statements concerning the effec- 
tiveness of given classes of reinforcers should 
be qualified by considerations of temporal 
matters. The personologist should also con- 
cern himself with the important question of 
the effectiveness of given reinforcers over 
time. It may well prove to be the case that, 
for given individuals, reinforcers which main- 
tain behavior for short intervals will prove 
to be ineffective for long-term behavioral 
maintenance. 

In addition to questions concerning the 
effectiveness of given classes of reinforcing 
stimuli, the assessment of individual differ- 
ences might very well take into account other 
reinforcement conditions. The scheduling of 
reinforcements seems an additional matter of 
importance. For different persons, behavioral 
maintenance may be enhanced by different 
schedules of reinforcement. A number of re- 
searches by Mischel and his colleagues (e.g., 
Mischel, 1961a, 1961b; Mischel & Metzner, 
1962) indicate clearly that differences can be 
expected in the ability of persons to sustain 
delay of noncontingent reinforcement. And 
closely related to individual differences in 
ability to tolerate delay in noncontingent re- 
inforcement is the matter of persistence of 
behavior in the absence of immediate con- 
tingent reinforcement. Even the most casual 
observation of persons suggests that differ- 
ences in behavioral persistence can be ex- 
pected under varying conditions of delayed 
contingent reinforcement. Thus, an important 
assessment question might center around the 
ability of the individual to sustain effort over 
temporal intervals of various lengths prior to 
receipt of reinforcement. 

A further consideration in the examination 
of individual differences in response to rein- 
forcement conditions centers about the pat- 
terning of reinforcements (Crandall, 1963). 
The individual can be expected to respond 
differently to nonreward embedded in a 
series of rewards as opposed to nonreward 
embedded in a series of punishments. And as 
is the case with other reinforcement condi- 
tions, one may very well expect to find dif- 
ferences among individuals in responsiveness 
to such patterning of reinforcing stimuli. 


Situational Specific Hypotheses 


Conditions of reinforcement constitute only 
one set of situational factors of interest to the 
individual differences psychologist. Situational 
specific hypotheses of the individual are a 
second important class of variables. While it 
might appear odd to discuss hypotheses of the 
individual in situational rather than struc- 
tural terms, there is some logic in this ap- 
proach. It would certainly appear to be the 
case that individuals develop hypotheses con- 
cerning themselves, others, and events in 
which they are involved. And the possibility 
that such hypotheses are important in the 
mediation of behavior cannot be overlooked. 
However, the generality of various subjective 
hypotheses of the individual remains open to 
question. As with other constructs, one would 
expect to find that the predictive utility of 
subjective hypotheses would decrease as 
specificity is sacrificed to more general state- 
ment. Thus, for example, when we ask a sub- 
ject in personality research to describe him- 
self on a trait dimension such as hostility, we 
are asking the subject for a rather complex 
inference, that is, a hypothesis concerning 
himself. To the extent that the subject at- 
tempts to answer such a question in the ab- 
stract, that is, without reference to his be- 
havior in actual situations, we would expect 
him to experience increasing degrees of un- 
certainty in arriving at a conclusion concern- 
ing himself. In essence, self-report inventories 
frequently require the subject to engage in 
rather complex inferential processes. In arriv- 
ing at tenable hypotheses concerning himself, 
other things being equal, the subject will be 
in a position to reduce uncertainty in direct 
proportion to the availability of relevant 
validating information. Some evidence from 
studies of self-prediction of performance sup- 
ports this line of reasoning. Mischel (in press) 
demonstrated that when subjects were per- 
mitted increased knowledge of a specific situ- 
ation, the utility of their hypotheses concern- 
ing future performance, that is, self-predic- 
tions, increased as well. 

It would appear to be the case, then, that 
hypotheses concerning oneself, others, and 
events in which one is involved, gain in pre- 
dictive utility to the extent that such hy- 
potheses are tied to specific situations. Hence, 
it would appear appropriate to consider such 
hypotheses in situational terms rather than as 
structural aspects of the personality. 

Individual differences in expectancies con- 
cerning success or failure in specific situa- 
tional contexts constitute one set of hypoth- 
eses of individuals worthy of the attentions 
of the personologist. As with other hypotheses 
of the individual, the question of the predic- 
tive utility of generalized expectancies is an 
open one. Rotter (1954) argued that an 
expectancy is composed of two components: 
generalized expectancy (GE) and expectancy 
specific to the task at hand (E’). How- 
ever, a number of researches suggest that 
expectancies are clearly influenced by situ- 
ational factors (e.g., James & Rotter, 
1958; Phares, 1957; Rotter, Liverant, & 
[...] Mischel and 
Staub (1965) suggests that when response 
outcomes are fairly clearly defined by success- 
failure information in specific situations, 
effects of generalized expectancies are mini- 
mal. Generalized expectancies, in the research 
by Mischel and Staub, affected delay of rein- 
forcement choices only when the subjects 
were confronted with a situation in which no 
information relevant to success probability 
was provided. In Atkinson’s (1958) achieve- 
ment-motivation model, expectancy is tied 
directly to situational manipulations with 
some success in the prediction of behavior. 
These findings suggest that E’ is likely of 
greater predictive utility than GE. 

In addition to situation specific hypotheses 
concerning success-failure, individual differ- 
ences in hypotheses concerning noncontingent 
reinforcing stimuli from social agents in the 
individual’s environment constitute another 
important assessment question for the per- 
sonologist. Without question persons can and 
do develop hypotheses about probabilities of 
noncontingent reinforcements from other per- 
sons in their social environments. That is, 
individuals come to think of specific persons 
in their interpersonal worlds as predisposed 
to behave toward them with “hostility” or 
“warmth” or “indifference,” etc., irrespective 
of the justification for such behaviors. 
Whether or not the other person is, in fact, 
predisposed to behave in certain ways toward 
the individual is not of importance here. How- 
ever, the fact that the person being assessed 
has developed subjective probabilities of non- 
contingent reinforcement from specific others 
is obviously of considerable import. 

Situation specific hypotheses involving suc- 
cess-failure and probabilities of noncontingent 
reinforcements from specific others are illus- 
trative of situational concerns of importance 
to the individual differences psychologist. Ad- 
ditional variables of importance which involve 
subjective hypotheses can be developed for 
particular assessments. 


Formal Properties of Situations 


A third class of situational variables con- 
trolling response performance involves dif- 
ferences in situations themselves in terms of 
their formal properties. While the assessment 
of persons should never proceed without ref- 
erence to some stimulus situation, it is un- 
fortunately the case that assessments which 
do take account of situations are the excep- 
tion rather than the rule. The attempt to 
predict to unknown situations is fraught with 
difficulty. 

The multifarious dimensions along which 
situations vary must, of necessity, render the 
discussion presented here suggestive rather 
than exhaustive. However, examination of 
several important formal attributes of situa- 
tions themselves is illustrative. For example, 
social situations vary considerably in the 
range of behaviors considered acceptable and 
desirable. Relative freedom versus constraint 
for various forms of behavior in given social 
contexts can and does exert powerful influ- 
ences over behavior. A very large portion of 
both individual and group behavior is pre- 
dictable from knowledge of role appropriate 
and inappropriate behaviors in given social 
contexts. 

Social environments differ in terms of the 
availability and accessibility of social rein- 
forcers of various kinds. An exploratory study 
by the author (Wallace, 1963) revealed that 
teachers in elementary schools varied remark- 
ably in terms of their properties as “rein- 
forcement machines.” One teacher adminis- 
tered verbal rewards and punishments in an 
approximate ratio of twenty rewards to a 
single punishment. Another teacher showed 
the reverse, administering approximately 
twenty verbal punishments to each reward! 
Striking differences in the behavior of the 
children in these two classes were apparent. 

Finally, recent research on social learning 
(Bandura & Walters, 1963) indicates clearly 
that the personologist interested in behavior 
prediction must examine vicarious response- 
reinforcement contingencies which obtain in 
the individual’s social environment. Without 
question, the performance of social behaviors 
is importantly influenced by observation of 
reinforcement outcomes experienced by oth- 
ers. Thus, for example, if one wished to pre- 
dict the occurrence of “deviant” behaviors 
for a given individual, one might well be 
advised to concentrate upon the frequency 
with which such behaviors are displayed by 
significant group members and the conse- 
quences for such behaviors. 

Obviously, the attempt to relate personality 
characteristics of any kind to behavior can- 
not proceed with any degree of success in the 
absence of detailed information about given 
social contexts. Rotter (1960) has so effec- 
tively argued the need to examine the situation 
carefully in personality assessment that it 
would appear redundant to pursue the matter 
in detail here. 

Despite arguments to the contrary, exami- 
nation of situational factors controlling re- 
sponse performance can be viewed as most 
congruent with the interests of the personolo- 
gist. Murray (1938), in his analysis of presses 
as well as needs, called attention to the neces- 
sity to consider the individual in some en- 
vironment. It would appear that in their 
search for a consistent man personologists 
have confused rather strikingly the two con- 
cepts of consistency and generality. While a 
man’s behavior may prove remarkably con- 
sistent under given sets of stimulus condi- 
tions, it is quite another matter to expect him 
to show generality of such behavior across 
markedly different conditions. To the extent 
that he does show invariant behavior across 
varied stimulus situations, one might be 
tempted to suggest that something is quite 
wrong with him. After all, one might best 
define psychotic behavior as behavior un- 
affected by stimulus variations, that is, be- 
havior under inappropriate stimulus control 
or in the absence of stimulus control. 

A capabilities conception of personality has 
implications for decision making in assess- 
ment situations. Once answers have been ob- 
tained to questions concerning the conditions 
under which the individual can demonstrate 
his capabilities, cost-decision considerations 
could center about two important questions. 
First, in cases where environmental flexibility 
permits, one would attempt to assess the costs 
involved in manipulating situations so as to 
provide optimal conditions necessary for maxi- 
mal performance. In essence, concern with the 
costs involved in “fitting the right situation 
to the man” would be uppermost in this case. 
Secondly, in cases where environmental flexi- 
bility does not obtain, cost-decision consid- 
erations would center about the costs in- 
volved in developing necessary response rep- 
ertoires in given individuals in order to meet 
situational demands. Approaching personality 
from this perspective would permit the per- 
sonologist to engage more directly the grow- 
ing concern with human potentialities. 


Implications for Personality Change 


The implications for change which have 
stemmed from dispositional concepts of per- 
sonality structure have proven most prob- 
lematic. When the structure of personality is 
construed in terms of stable and enduring 
response dispositions which have reached full 
development early in the developmental his- 
tory of the individual, the possibility of 
change seems limited. Moreover, conceiving of 
personality structure in terms of various en- 
ergy units poses serious difficulties for alter- 
ation of such structure. How does one, for 
example, modify an instinct? Similarly, other 
than providing for temporary satiation of 
given needs, how does one go about altering 
the patterning of an individual’s needs in 
some fundamental sense? In recognition of 
the fact that it is difficult to modify energy 
directly, dispositional theorists have ap- 
proached the problem of personality change 
through various tactics. Redistribution of 
energy (sublimation), release (catharsis), and 
control through uncovering of unconscious 
determinants (insight) have constituted three 
important tactics of change. 

Within recent years, each of these tactics 
for producing behavioral change has been 
challenged sharply. London (1964) has ex- 
amined critically the assumption that insight 
into putative unconscious behavioral disposi- 
tions is of value in producing behavioral 
change. Recent well-controlled research by 
Bandura (1965b) appropriately calls into 
question the validity of the catharsis notion. 
As Bandura appropriately points out, the ex- 
pression of socially unacceptable behaviors 
such as hostility would appear to lead to an 
increase in the probability of further expres- 
sion rather than a decrease. Furthermore, one 
would expect such expression to be further 
enhanced if it eventuates in positive rein- 
forcement, and also, if inhibitory factors over 
further expression are eliminated. Redistribu- 
tion of energy tactics are based upon ques- 
tionable assumptions such as “symptom sub- 
stitution.” It is frequently assumed that one 
cannot change a single aspect of the person- 
ality without giving rise to the expression of 
various other compensatory symptoms. Hence, 
if one were to treat a phobia successfully by 
direct means without addressing oneself to 
the putative “underlying conflict” upon 
which the phobia is based, the patient would 
be expected to display some other symptom. 
Recent evidence from a wide variety of be- 
havioral modification settings (e.g., Krasner 
& Ullmann, 1965; Wolpe, 1958) suggests 
that symptom substitution not only does not 
occur, but one may expect generalization of 
positive effects in the successful treatment of 
isolated symptomatic behaviors. 

While the implications for personality 
change inherent in disposition conceptions 
have proven most problematic, those stem- 
ming from a capability conception of struc- 
ture are straightforward indeed. When the 
structure of personality is construed in terms 
of units of skill, the problem of personality 
change is seen as one involving the develop- 
ment of response repertoires in conjunction 
with the selection and maintenance of situa- 
tions which will enhance performance. Sig- 
nificant strides along these lines have al- 
ready been made in behavioral modification 
procedures involving positive reinforcement 
(e.g., Bandura & Walters, 1963; Ferster, 
1961; Krasner & Ullmann, 1965), assertive 
behavior training (Wolpe, 1958), fixed-role 
therapy (Kelly, 1955), and vicarious learn- 
ing through use of behavior models (Bandura, 
1965c). 

It should not be assumed that procedures 
for enlarging response capabilities must, of 
necessity, be restricted to small behavior 
units. Kelly’s (1955) fixed-role therapy dem- 
onstrates the feasibility of working with 
large and complex units in developing re- 
sponse repertoires. And interest is developing 
in the possibility of employing role theoreti- 
cal constructs as units of change in positive 
reinforcement settings (Krasner & Ullmann, 
1965). Moreover, the use of modeling pro- 
cedures can provide the means through which 
very large behavioral sequences can be trans- 
mitted to an observer. Nor should it be 
assumed that increasing the individual’s capa- 
bilities is restricted entirely to behavioral or 
action units. It is entirely conceivable that 
cognitive strategies of various kinds can be 
taught directly to individuals in efforts to 
increase their capabilities. Thus, for example, 
a portion of the psychotherapist’s time might 
well be devoted to teaching the patient ap- 
propriate decision-making strategies and tac- 
tics. Similarly, direct tuition of the patient in 
strategies for seeking, assimilating, and uti- 
lizing information necessary for the valida- 
tion of hypotheses he develops concerning 
himself, others, and events in which he is 
involved can be seen as an integral part of 
psychotherapy and other behavioral modifica- 
tion settings. An approach to the problems of 
human adjustment in terms of tactics and 
strategies as an alternative to the classical 
approach of defenses has been developed fully 
elsewhere (Sechrest & Wallace, in press). 

This, then, has been an attempt to take 
seriously the important question raised by 
Allport (1958), “What units shall we em- 
ploy?” Construing personality in terms of the 
two central foci of response capability and 
response performance leads one to concern 
with units not suggested by dispositional con- 
cepts. A psychology of personality in which 
we choose to search for that of which man is 
capable would appear to have important im- 
plications. Moreover, these implications would 
appear to be meaningful and relevant to the 
scientific investigation of personality, whether 
we choose to work at the level of theory and 
research or that of application. 


VALIDITY AND RELIABILITY STUDIES OF A COMPUTER-BASED 
SCORING SYSTEM FOR INKBLOT RESPONSES ¹ 


DONALD R. GORHAM ² 


Veterans Administration, Perry Point, Maryland 


A computer scoring system has been validated for scoring 17 Holtzman Ink- 
blot Technique Variables: Location, Rejection, Form Definiteness, Color, 
Shading, Movement, Integration, Human, Animal, Anatomy, Sex, Abstract, 
Anxiety, Hostility, Barrier, Penetration, and Popular. The basic sample con- 
sisted of 145 college students to whom the HIT was group administered. 
An expert scorer’s values were the criteria for validating computer scores. 
Validity of computer scoring was attested by comparability of means and 
standard deviations, by acceptable correlations between the 2 methods, and 
by identical factor structure among 8 rotated factors. The correlation of the 
computer with the average of 3 hand scorers equaled or approached the 
interscorer reliability of the scorers. Cross-validation studies demonstrated that 
equally satisfactory results were obtained for both Forms A and B. Finally, 
the computer was able to achieve scores from group records which were 
essentially equal to scores from records individually administered 1 wk. earlier 
and hand scored. 


In former reports (Gorham, 1965; Mose- 
ley, Gorham, & Hill, 1963), the development 
of a computer-based system for scoring ink- 
blot responses has been described. The com- 
puter scoring method was applied to a test 
with a clearly defined scoring method with 
one response per stimulus: the Holtzman 
Inkblot Technique (HIT) (Holtzman, 
Thorpe, Swartz, & Herron, 1961). If it could 
be demonstrated that the computer could ap- 
proximate the values of hand scorers for this 
test, computer scoring of other, less well-defined 
projective techniques might be at- 
tempted. In this paper, evidence for the valid- 
ity and reliability of the method will be 
presented. 

1 Portions of this paper were presented as part 
of a symposium, “Problems in Cross-cultural Re- 
search,” at the Tenth Interamerican Congress of 
Psychology, April 6, 1966, Lima, Peru. 

2 The author is principal investigator of a project 
supported by the Veterans Administration and by a 
three-year NIMH Grant 10273. Coinvestigators are 
Edward C. Moseley, Consultant to NASA, and 
Wayne H. Holtzman, Dean, School of Education, 
University of Texas. 

The development of the system involved 
group administration to expedite the collec- 
tion of large samples and the creation of a 
computer program to simulate hand scoring. 
The group method was a slight modification 
of the Holtzman group administration 
(Swartz & Holtzman, 1963). Each response 
was limited to six words and any attempt at 
inquiry was eliminated. Responses to each of 
the 45 HIT blots were key punched on IBM 
cards, which became the raw data for com- 
puter scoring. Form A was used for develop- 
mental purposes. The computer scoring pro- 
gram was evolved through the following steps: 

1. The building of an empirical list of 
words used by subjects to describe inkblots. 
The current dictionary of 6,100 words con- 
sists of all words used more than once by 
1,200 subjects (54,000 inkblot responses) 
from the United States, Hong Kong, Aus- 
tralia, Panama, Denmark, Germany, Mexico, 
India, Turkey, Japan, and Yugoslavia. 

2. Assigning scoring weights to each of the 
6,100 words to score 17 Holtzman Inkblot 
Technique (HIT) variables: Location (L), 
Rejection (R), Form Definiteness (FD), 
Color (C), Shading (Sh), Movement (M), 
Integration (I), Human (H), Animal (A), 
Anatomy (At), Sex (Sx), Abstract (Ab), 
Anxiety (Ax), Hostility (Hs), Barrier (Br), 
Penetration (Pn), and Popular (P). 

3. The adjusting of weights to maximize 
the correspondence with an expert hand 
scorer. Consensus of three investigators was 
originally used after which the weights for 
all variables were reviewed by an expert hand 
scorer. In addition, Br and Pn weights were 
reviewed by their originators, Fisher and 
Cleveland (1958). The current scoring dic- 
tionary is the sixth revision. 

4. Creation of pattern-scoring computer 
subroutines for certain variables, namely, 
Color, Shading, and Hostility. In the case of 
C and Sh, the relationship to FD modified 
the score; Hs was modified in terms of 
Animal or Human content. 

5. One score, I, was corrected by a multiple- 
regression equation since satisfactory esti- 
mates could not be achieved by the weighted 
dictionary alone. Modifications of FD and L 
which were dependent upon Rejection were 
handled by computer subroutines. 
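
The dictionary-lookup core of the steps above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the words, the weights, and the function names `score_response` and `score_protocol` are invented here and are not entries or routines from the actual scoring program. The pattern-scoring subroutines and the regression correction for Integration are deliberately left out, since the text does not specify them.

```python
from collections import defaultdict

# Hypothetical mini-dictionary: word -> {variable: weight}.
# Words and weights are invented for illustration; they are NOT taken
# from the 6,100-word scoring dictionary described in the text.
SCORING_DICTIONARY = {
    "bat": {"A": 1, "P": 1},
    "man": {"H": 1},
    "fighting": {"M": 2, "Hs": 2},
    "blood": {"C": 1, "Ax": 1},
    "skeleton": {"At": 1},
}

MAX_WORDS = 6  # each response was limited to six words


def score_response(response):
    """Sum dictionary weights over the first MAX_WORDS words of one response."""
    totals = defaultdict(int)
    words = response.lower().split()[:MAX_WORDS]
    for word in words:
        # Words absent from the dictionary contribute nothing.
        for variable, weight in SCORING_DICTIONARY.get(word, {}).items():
            totals[variable] += weight
    return dict(totals)


def score_protocol(responses):
    """Accumulate variable totals across all responses in one record."""
    totals = defaultdict(int)
    for response in responses:
        for variable, value in score_response(response).items():
            totals[variable] += value
    return dict(totals)


if __name__ == "__main__":
    record = ["two man fighting", "bat", "blood on skeleton"]
    print(score_protocol(record))
```

A real record would hold 45 responses, one per blot; the three-response record here is only meant to show how word weights accumulate into variable totals.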

In the balance of this paper, tests of the 
validity and reliability of the computer com- 
pared with hand scores will be presented both 
for the original developmental population and 
for cross-validation populations. 


VALIDATION OF ORIGINAL SAMPLES 


Early validation attempts have been re- 
ported in a prior paper (Gorham, 1965). One 
series of studies utilized the original protocols 
which were the basis of the published paper 
on the group method of administration 
(Holtzman, Moseley, Reinehr, & Abbott, 
1963; Swartz & Holtzman, 1963). A second 
series utilized 145 group-administered records 
with a six-word response restriction. These 
records were hand scored by an expert scorer. 
The combined results of these early valida- 
tion efforts indicated that about 10 of the 
17 HIT variables being worked with could be 
scored at an acceptable level of confidence. 
During the past 12 months, the efficiency of 
scoring the remaining seven variables has 
been raised to satisfactory levels. 

The basic sample upon which validation 
was determined consisted of 145 students at 
the University of Delaware, all of whom took 
the HIT together, with the inkblot projected 
on a 10 × 10 foot screen. The comparison 
of hand scoring with computer scoring is 
shown in Table 1. An inspection of the means 
and standard deviations reveals that on 11 
of the variables the values produced by com- 
puter and hand scores are essentially iden- 
tical. For the remaining six variables, some 
computer scores were too high and some too 
low, when compared with the hand scorer. 


TABLE 1 

A COMPARISON OF HAND SCORING AND COMPUTER 
SCORING FOR 17 HIT VARIABLES 

(Based on 145 six-word Form A group-administered 
college student records) 

            Hand scored                           Computer scored 
Variable    Mean          SD            r             Mean 
L           43.7          13.9          .97           42.6 
R           1.8           3.3           1.00          1.8 
FD          82.7          12.3          .84           79.4 
C           5.3           4.7           .79           8.3 
Sh          1.9           1.7           .58           3.8 
M           28.0          12.8          .93           27.8 
I           2.5           2.0           .67           2.5 
H           24.2          8.6           [illegible]   25.1 
A           22.8          7.0           .90           25.1 
At          2.1           2.6           .80           2.0 
Sx          [illegible]   3.1           .94           0.6 
Ab          0.4           1.3           .54           1.3 
Ax          7.8           4.4           .19           8.5 
Hs          10.2          4.6           .65           6.5 
Br          5.7           2.9           .63           10.3 
Pn          [illegible]   [illegible]   .62           3.8 
P           [illegible]   2.9           .17           9.5 

Note. Location self-scored by the students on the basis of 
whole blot = 1, one-half of blot = 2, less than one-half of blot 
= 3; these scores were then converted to standard Location 
scores by the formula L = L raw - (45 - R). 
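
As a worked illustration of the conversion formula in the note to Table 1, the short sketch below applies L = L raw - (45 - R) to a single record. The function name `standard_location` and the example numbers are assumptions made for illustration; only the formula itself and the figure of 45 blots come from the text.

```python
N_CARDS = 45  # number of HIT blots per form, as stated in the text


def standard_location(raw_location_sum, rejections, n_cards=N_CARDS):
    """Convert a self-scored Location total to the standard Location score.

    Implements L = L raw - (45 - R): one point is subtracted for every
    blot that was not rejected, so the self-scored codes 1, 2, 3 are
    shifted down to 0, 1, 2 per answered blot.
    """
    if not 0 <= rejections <= n_cards:
        raise ValueError("rejections must be between 0 and n_cards")
    return raw_location_sum - (n_cards - rejections)


if __name__ == "__main__":
    # Hypothetical record: 2 blots rejected, and the self-scored codes on
    # the remaining 43 blots sum to 90.
    print(standard_location(90, 2))  # 90 - (45 - 2) = 47
```

A record in which every blot is answered as a whole-blot response (code 1 on all 45 blots, no rejections) therefore converts to a standard Location score of 0.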


The correlations between the computer
and an expert hand scorer were high enough to
indicate essential equivalence of the two
methods. Since both means and standard
deviations between hand and computer scor-
ing can be adjusted, a more crucial test of
invariance is found in the intercorrelations of
the 17 variables,³ and more specifically in the
factor structure of the intercorrelations.
A factor analysis⁴ was performed in which
eight factors were rotated to orthogonal
structure. A summary of the highest rotated
factor loadings is shown in Table 2. All fac-
tors with a loading of .50 or higher for either
hand or computer scoring are shown. It will
be noticed that in every instance the iden-
tical variables make up each of the eight
factors and that in most cases the loadings
are very similar for the two scoring methods.
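
A minimal sketch of why agreement is judged here by correlations rather than by means: the product-moment correlation is unchanged when one set of scores is shifted or rescaled by a positive constant, which is the sense in which means and standard deviations "can be adjusted." The function name and the scores below are illustrative assumptions, not data from Table 1.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores: the "computer" values run systematically higher and
# are more spread out, yet rank and spacing match the hand scores.
hand = [12, 18, 25, 31, 40]
computer = [2 * h + 3 for h in hand]
print(round(pearson_r(hand, computer), 6))  # 1.0
```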


3 This correlation matrix will be sent by the author
upon request.

4 A principal-axis factor analysis using unity
communality estimates and subsequently rotating the
first eight factors by Kaiser’s Varimax method.


TABLE 2


HIGHEST ROTATED FACTOR LOADINGS FOR HAND AND COMPUTER SCORING
(Based on 145 college students: Group administration)

Hand scoring
I II III IV V VI VII VIII
H = 86 Sx = 90 C=81 L= Hs = 84 Br = 79 At = 87 A = 88 
P = 75 Ab = 74 Sh = 66 R = -53 Ax = 75 FD = 45
M=58 M = 66 
I= 54 
Computer scoring 
I II III IV V VI VII VIII
H= 85 Sx = 89 C=85 L= Ax= 74 Br = 81 At = 89 A=90 
I = 79 Ab = 45 Sh = 68 R = -54 M = 71 FD = 55
P=78 Hs = 57 
M = 47 


RELIABILITY OF COMPUTER AND 
HAND SCORING 

Reliability and validity are closely inter- 
twined. In determining the validity of com- 
puter scoring, the hand scoring of an expert 
has been taken as the criterion which the 
computer must replicate. Such an approach 
presumes the infallibility of the hand scorer.
In order, therefore, to secure a broader base
as a criterion and at the same time to deter-
mine interscorer reliability of hand scorers
when asked to score group records without
inquiry material, two additional hand scorers
were asked to score a subsample of 50 cases
from the University of Delaware student
protocols. These protocols were on standard
record forms so that the scorers knew the
blot area for each response.


TABLE 3


A COMPARISON OF COMPUTER SCORING WITH THREE HAND SCORERS
(Based on 50 group-tested college students)


Correlations with computer of three hand scorers; Variable; Correlations between three hand scorers

C and 1 C and 2 C and 3 C and Average Variable 1 and 2 1 and 3 2 and 3 Average
99 93 95 97 L -94 -96 94 95
1.00 1.00 99 1.00 R 1.00 99 99 99
81 73 76 81 FD 78 87 86 84
77 87 79 89 C 74 69 82 75
199) -50 63 .70 Sh 14 50 49 +38
192 75 «90 89 M 81 96 19 85
wS 58 AGS 69 i 56 1 63 Kr
.92 -90 89 92 H 95 96 94 95
.90 87 90 91 A 94 94 .91 93
89 19 75 87 At 82 86 15 81
99 99 98 99 Sx 1.00 1.00 99 1.00
60 -60 -61 -66 Ab .73 88 1 Sus
80 x2 -69 81 Ax 73 73 .73 73
64 57 46 59 Hs .79 83 79 80
64 13 72 81 Br 61 55 -70 62
56 +39 63 58 Pn 62 60 62 62

The three hand scorers may be character-
ized as follows: Scorer 1 was an expert scorer
who has worked in Holtzman’s HIT Research 
Laboratory for 7 years; Scorer 2 was a clini- 
cal psychologist with 4 years of experience 
in administering and scoring the HIT; Scorer 
3 was a mature clinical psychologist, skilled 
in Rorschach techniques, who studied the 
HIT Scoring Manual and followed it pre- 
cisely in scoring these 50 records. It should 
be emphasized that responses were limited to 
six words and were without inquiry. The 
reliability coefficients of each rater compared 
with the computer and compared with each 
other are given in Table 3. 

In general, there was a surprisingly high 
degree of consistency between the computer 
and the hand scorers. The reliability of com- 
puter scoring, however, is best demonstrated 
by the fact that for 10 variables the correla- 
tion of computer with the average of the 
three hand scorers was equal to or greater 
than the interscorer reliability. For five addi- 
tional variables, the computer was only a few 


TABLE 4 


ODD-EVEN RELIABILITY COEFFICIENTS FOR HAND AND
COMPUTER SCORING 


(Based on scores from 22 odd and 22 even HIT records 
of 145 college students) 


Computer scored Hand scored 
r Variable r 
88 L 88 
87 R -16 
712 FD 17 
61 C -60 
38 Sh 28 
64 M -70 
72 I 43 
10 H -15 
.63 A 64 
45 At 52 
85 Sx 94 
59 Ab -16 
41 Ax 49 
-35 Hs AT 
48 Br 39 
39 Pn 45 
40 P <25 


Note.—Correlations are corrected by Spearman-
Brown to estimate the reliability coefficient of a asbie RIE
protocol.


points lower. The two variables on which the
computer was least efficient were Ab and Hs.
The reliability coefficients were 11 points
lower for Ab and 21 points lower for Hs.

A second test of reliability was a split-half
correlation between 22 odd and 22 even ink-
blot responses. This test was made on the
basic sample of 145 college students, scored
by computer and by the expert hand scorer.
The results are shown in Table 4. These find-
ings clearly demonstrate that the computer
matches hand scoring in terms of split-half
reliability, and compares favorably with
Holtzman’s odd-even reliability coefficients. It
is also made clear that the variable scores
which were most difficult to simulate by com-
puter are the ones for which hand scorer
reliability is lowest, in particular Shading,
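
Table 4 reports split-half correlations stepped up to full length. A minimal sketch of that step follows, assuming the usual Spearman-Brown formula for a lengthened test, r_full = k·r / (1 + (k − 1)·r), with k = 2 for two halves; the function name and the example value are illustrative and are not figures from the table.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Step a part-test correlation up to the reliability of a longer test.

    length_factor is how many times longer the full test is than the part
    on which r_half was computed (2.0 for an odd-even split).
    """
    denominator = 1.0 + (length_factor - 1.0) * r_half
    if denominator <= 0:
        raise ValueError("correction is undefined for this correlation")
    return length_factor * r_half / denominator


# Hypothetical odd-even correlation of .60 between the two half-records.
print(round(spearman_brown(0.60), 2))  # 0.75
```

The correction always raises a positive half-test correlation, which is why the corrected coefficients are the appropriate ones to compare with reliabilities of complete protocols.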


CROSS-VALIDATION


If a computer system can score only six-
word responses of group-administered Form
A protocols, its usefulness is considerably
limited. Several questions need to be answered
before it can be said with confidence that
computer scoring can be substituted for hand
scoring:

1. Can equally satisfactory results be
achieved on a new sample?

2. Can Form B records be scored satisfac-
torily by a system which was developed
exclusively on Form A records?

3. Is the six-word response restriction upon
which the computer scoring system was
developed a limiting factor?

Several studies were carried out to answer
these questions. Since the original hand-
scored protocols upon which Holtzman
(Swartz & Holtzman, 1963) validated the
group-administration method were accessible,
this was an ideal sample upon which to cross-
validate the computer method. These records
were not limited to any number of words and
consisted of both Form A and Form B test
records. One hundred Form A and 85 Form B
records were key punched and scored by the
computer in the usual manner. Hand-scored
summary cards for each subject were sup-
plied by Holtzman’s Research Laboratory.
The means, standard deviations, and intercor-
relations of computer and hand scoring for
these records are presented in Tables 5 and 6.


COMPUTER SCORING SYSTEM For INKBLOT RESPONSES 69 


TABLE 5 


CROSS-VALIDATION ON GROUP-ADMINISTERED
FORM A PROTOCOLS


(Based on records of 100 college students)


Hand scored                   Computer scored
Mean   SD    Variable   r     Mean   SD
40.2 17.0 L 97 40.4 17.4 
17° 34 R 99 16 34 
76.7 14.5 FD 81 79.2 12.6 
25.4 11.2 C 83 25.3. 10.3 
10.2 7.7 Sh 69 10.6 69 
30.6 14.4 M 89 36.7 14.6 
6.1 3,9 I .62 44 20 
26.0 9.2 H -88 26.6 84 
22.7 6.5 A 80 24.2 5.9 
31 29 At .13 31 24 
A Sx .62 of Tigge RA 
t jar A) Ab .65 Zl 2.3 
95 65 Ax 73 10.9 61 
9.7 54 Hs 75 75 44 
5.2 4.0 Br 50 13.5 5.1 
35) 2.7 Pn 66 Of mon, 
9.2 2.9 P 75 90 28 


Apparently, the versatility of the computer 
scoring system had been underestimated since, 
in general, the correlations between computer 
and hand scoring remained as high when the 
six-word restriction was lifted. In some in- 
stances the correlations increased. The com- 
puter evidently was able to benefit in the 
same manner as a human scorer when a 
richer protocol was provided. The fact that 
the computer matched hand-scored Form B 
equally as well as Form A indicates the 
equivalence of the two forms and the versatil- 
ity of the computer method. 


THE GENERALITY OF THE METHOD 


Perhaps the most important question 
which could be asked concerning an auto- 
mated inkblot-test method (group-adminis- 
tered and computer-scored) is: To what ex- 
tent does this method produce scores which 
are equivalent to the standard individually 
administered hand-scored HIT records? In 
order to answer this question a portion of a 
former study comparing group and individual 
administration (Holtzman et al., 1963) was 
replicated, scoring the group-administered
records by computer rather than by hand 
as in the original study. In the original study, 


100 college students were given the group- 
administered HIT 1 week after having taken 
the individual test. Fifty took Form A group 
following Form B individual and vice versa. 
The individually administered hand-scored 
records were available from Holtzman’s Lab- 
oratory on IBM cards. The protocols for the 
group records were key punched and com- 
puter scored. When computer scoring was 
substituted for hand scoring for the group- 
administered records, the correlations were 
essentially the same as those obtained by the 
Holtzman group in the original study, 
namely, L: 67 (69), R: 57 (57), FD: 56
(60), C: 33 (42), Sh: 38 (31), M: 55 (56),
I: 22 (23), H: 43 (46), A: 54 (53), At: 32
(36), Ax: 42 (44), Hs: 37 (47), Br: 42 (34),
Pn: 24 (27), P: 28 (14).⁵ In their discussion,
Holtzman et al. (1963) conclude: 


5 Computer-scored reliability coefficient followed 
in parentheses by the multimethod coefficient found 
by Holtzman et al. These values are not corrected 
for attenuation due to test-retest unreliability since 
the comparison of computer and hand scoring is 
sufficiently demonstrated by the uncorrected values 
for both methods.


TABLE 6 


CROSS-VALIDATION ON GROUP-ADMINISTERED
FORM B PROTOCOLS


(Based on records of 85 college students)


Hand scored                   Computer scored
Mean   SD    Variable   r     Mean   SD
38.0 14.0 L .90 38.8 16.3 
18 34 R 99 16 3.1 
78.2 14.4 FD 81 78.2 14,3 
25.5 10.5 C 81 28.2 10.6 
10.5 6.4 Sh 70 10.2 6.6 
31.9 15.0 M 85 35.4 15,1 
47 34 I 46 M 17 
232 91 H .87 23.4 8.6 
24.1 7.7 A 85 263 7.3 
34 3,3 At 84 30 34 
04 1.0 Sx 64 05 1,0 
08 2.6 Ab 60 17) 19 
10.9 6.9 Ax .79 11.7 6.3 
10.4 63 Hs -76 73 43 
5.7 9315 Br 59 13.0 43 
3.7 2.6 Pn 61 69 35 
79 28 fA 


Note.—The computer subroutine for scoring Form B Popular
had not been written at the time of publication of this paper.


The high degree of comparability of the group
method and the standard individual method sug- 
gests that the group version may be confidently sub- 
stituted for the more time consuming individual 
version when it is desirable to obtain a large number 
of inkblot protocols with minimum effort in a short 
period of time. It should be remembered, however, 
that there is no saving in scoring [p. 448]. 


Since the computer scoring method does 
provide saving of time and cost in scoring,⁶
it makes the HIT readily available for large-
scale research. Some of the areas in which
the method seems particularly appropriate 
and potentially useful are large-scale screen- 
ing for prognostic and diagnostic purposes 
in educational and medical settings. 

In other papers, Moseley (1966) and 
Pardo, Davila, and Diaz-Guerrero (1966) 
discuss the computer scoring of inkblots in 
the Spanish language and the results of some 
cross-cultural comparisons growing out of the 
method. A Spanish Scoring Dictionary is al- 
ready operational, and preliminary studies 
show that 17 HIT Variables can be scored 
equally well directly in Spanish or after 
translation into English. Scoring dictionaries 
in other languages could be prepared if suf- 
ficient demand developed for direct scoring 
of protocols without translation. These appli-
cations are of particular interest because they
open up the possibilities of direct cross-
cultural comparisons of significant personal-
ity variables without the vexing problems
introduced by language translation.


6 The present computer program for the IBM
7090 scores approximately 20 records per minute
of actual computer time in batches of 100 or more
records. Card-to-tape, read-in and print-out, and
punch-out times add a cost approximately equal to
the computer time. A good key-punch operator can
punch 6-10 records per hour.


SELF-REPORT OF HOSTILITY AND THE INCIDENCE OF SIDE
REACTIONS IN NEUROTIC OUTPATIENTS TREATED
WITH TRANQUILIZING DRUGS AND PLACEBO¹


ROBERT W. DOWNING AND KARL RICKELS


University of Pennsylvania 


It is argued that the reporting of side reactions during the course of drug 
treatment may serve as an indirect mode of expressing hostility which is 
more likely to occur in patients whose personality orientation renders difficult 
direct hostility expression. In a sample of 47 Negro female clinic outpatients, 
it was hypothesized and found that side reactors obtained significantly lower pre- 
treatment Buss-Durkee Total Hostility scores than non-side-reactors (t = 2.78, 
df=45, p < .01) and that differences between side reactors and non-side- 
reactors were significantly greater on an Index of Direct Hostility than on an 
Index of Indirect Hostility (t=3.09, df = 45, p < .005). Differences between 
drug- and placebo-treated patients in relationships between side-reaction status 
and mode of hostility expression are also discussed. 


Theoretical considerations, reinforced by 
clinical experience, have led us to expect that 
(a) patients reporting side reactions during 
treatment will have described themselves as 
less hostile prior to treatment than patients 
who do not report side reactions and (b) the 
difference in pretreatment self-report of hos- 
tility between side reactors and non-side- 
reactors should be greater for direct than for 
indirect modes of hostile expression. 

Evidence has accumulated since the formu- 
lation of the frustration-aggression hypothesis 
by Dollard, Doob, Miller, Mowrer, and Sears 
(1939) and its modification by Miller 
(1941), that situations in which hostility 
is aroused and its expression inhibited are 
likely to produce an indirect expression of 
hostility. The doctor-patient relationship is 
generally recognized as one which inhibits 
the ready expression of direct hostility by 
the patient. Further, we have observed that 
the authority and power attributed to the 
doctor are particularly great among lower 
socioeconomic class patients of the type dealt 
with here. Graham, Charwat, Honig, and 
Weltz (1951) found that subjects expressed 
aggression less readily toward figures of 
higher authority or status than toward either
peers or persons perceived as inferior in
socioeconomic class.


1 Work was supported by USPHS Grants MH-
02934 and MH-08957-8 and carried out at Phila-
delphia General Hospital and the Hospital of the
University of Pennsylvania.

Watson, Pritzker, and Madison (1955) 
and Wahler (1959) have provided some evi- 
dence that the level of interpersonal hostility 
tends to be higher in neurotics than in nor- 
mals. As detailed in the procedure section, 
many aspects of the present drug-treatment 
situation had strongly frustrating elements. It
might thus be expected that the present 
treatment context, combining as it did provo- 
cations toward hostility and the inhibition of 
its direct expression, was one in which the 
pressures toward the indirect expression of 
hostility were high. 

Considerable support involving widely di- 
vergent situations, subjects, and measurement 
techniques is to be found in the literature for 
the contention that indirect expression of 
hostility is more likely in those individuals 
whose personality orientation involves the 
curtailment of direct hostile expression (cf. 
Conn & Crowne, 1964; Feshbach, 1961; 
Hetherington & Wray, 1964; Jensen, 1957; 
Veldman & Worchel, 1961). In the course of 
previous work evaluating drug effectiveness 
(Rickels & Downing, 1962), a substantial 
group of patients was encountered who com- 
plained of side reactions while being treated 
with placebo, or whose side reactions during 
drug treatment took a form which could not 
readily be related to the usual physiological 
reactions produced by the drug and which
thus were considered to be affected by factors
of a psychological nature. These patients
gave the impression of being dissatisfied with 
therapy, but of finding it difficult to com- 
municate their dissatisfaction directly to the 
treating physician. Their manner was such as 
to suggest that critical communication, let 
alone the expression of anger directed toward 
any authority figure, would be for them 
quite difficult. 

Thus it appears reasonable to hypothesize 
that patients who report side reactions when 
treated with minor tranquilizers or placebo 
may use this reporting of side reactions as 
an indirect mode of hostility expression. It 
might further be expected that such patients 
have a personality orientation which makes 
it more difficult for them to express hostility 
directly. Grounds for making predictions con- 
cerning the orientation toward hostility of 
non-side-reactors would appear less secure, 
yet our clinical experience suggests that a 
major factor with them is a preference for a 
more direct form of hostility expression. 

Consideration of the pressures toward in- 
direct expression of hostility in the present 
treatment situation and of the differing modes 
of dealing with hostility expression posited in 
side reactors and non-side-reactors led us to 
the formulation of three hypotheses. If side
reactors have more difficulty than non-side- 
reactors in giving direct expression to their 
hostility, they should be generally more reluc- 
tant to endorse questionnaire items which 
make reference to any form of hostile behav- 
ior. This leads to: 


Hypothesis 1. Those patients who report side 
reactions during the course of drug or placebo 
treatment will have obtained lower pretreatment 
Buss-Durkee Total Hostility scores than those who 
fail to report side reactions. 


Even though the orientation of side re- 
actors toward hostility expression is antici- 
pated to make less likely their endorsement 
of all questionnaire items descriptive of hos- 
tile behavior, the probability should be 
greater that they would endorse items de- 
scriptive of indirect than of direct modes of 
hostility expression. In contrast, non-side- 
reactors are not expected to show a selec-
tively higher endorsement rate of Indirect
Hostility items and might be expected to
show a higher endorsement rate for Direct 
Hostility items. These considerations result 
in: 


Hypothesis 2. Differences between side reactors 
and non-side-reactors will be greater for an Index 
of Direct Hostility than for an Index of Indirect 
Hostility derived from the Buss-Durkee Scale. 


Patients receiving placebo are probably 
subjected to greater frustration of their 
desire for symptom alleviation. Also, placebo 
is free from the psychological effects which 
no doubt attenuate the role of personality
factors in the generation of side reactions.
Consequently, we are led to: 


Hypothesis 3. The differences between side reactors 
and non-side-reactors in pretreatment Total Hostil- 
ity scores and Indexes of Direct and Indirect Hos- 
tility detailed in Hypotheses 1 and 2 will be found 
to be more pronounced in placebo-treated than in 
drug-treated patients. 


METHOD 
Subjects 


The subjects were 47 Negro females of lower- 
middle to lower socioeconomic class who sought 
treatment at outpatient clinics in either the Hos- 
pital of the University of Pennsylvania or Phila- 
delphia General Hospital. All were diagnosed as 
psychoneurotic and had anxiety and/or mild de- 
pression as prominent symptoms. Additional symp- 
tomatology varied in focus from reported difficulties 
in interpersonal relations to pronounced emphasis 
upon somatic complaints (cf. Downing, Rickels, 
Downing, & Robinson, 1964). Patients were ran- 
domly assigned to treatment with tranquilizers or 
placebo in accordance with a double-blind research 
paradigm. Completed Buss-Durkee protocols were 
available for 27 drug-treated and 20 placebo-treated 
patients. Analysis of variance revealed that treat- 
ment groups did not differ significantly either in 
Total Hostility score or in differences between 
Direct and Indirect Hostility scores. During the 
course of treatment, 26 of the subjects reported 
symptoms which could be classed as side reactions 
and 21 did not. The mean age for side reactors 
was 40.91 with a standard deviation of 13.10; for 
the non-side-reactors, mean age was 40.29 with a
standard deviation of 10.78. There were no differ-
ences in other demographic characteristics between
those patients reporting side reactions and those
who did not do so.


Procedure 


Patients were initially screened by a psychiatrist 
who made a diagnosis and evaluated their suitability 
for participating in a double-blind research program
in which several tranquilizing drugs and a placebo
were employed. Irrespective of the particular drug-
evaluation procedure in which they were to par-
ticipate, all patients able and willing to take a
battery of instruments (designed to measure hostility 
and a number of other personality variables and 
attitudes) were tested (see Rickels & Downing, 1962 
for a complete description of all procedures utilized). 
The test battery required from 1½ to 3 hours to
administer and was presented to one to four patients 
at a time by a female technician. Although the 
patients’ attitudes toward the testing varied from 
interested participation to grudging compliance, they 
typically regarded the examiner as an authority 
figure, often addressed her as “Doctor,” and at- 
tempted to discuss their difficulties with her. Such 
factors as crowded clinic facilities and long waits 
frequently necessary before seeing the psycho- 
metrician and the treating physician introduced frus- 
trating elements into the treatment situation (cf. 
Downing et al., 1964).

Since Negro females provided the largest group 
homogeneous in race and sex for which data were 
available, they were selected for the present analy- 
sis. As a result of the particular drug studies that 
were underway at the time the data were collected, 
the treatment agents received by the 26 side reactors 
and 21 non-side-reactors were as follows: Side 
reactors: 10 placebo, 16 minor tranquilizers; non- 
side-reactors: 10 placebo, 11 minor tranquilizers. 
Dosage levels were those recommended as thera- 
peutic by the drug manufacturer. 

The treatment period over which evaluations were 
made was of 4 weeks’ duration. During this interval, 
the patient was seen three times by a psychiatrist
with each contact lasting approximately 15 minutes.
During a pretreatment session, the patient was
oriented to the treatment program and given a
2 weeks’ supply of medication. Her condition was
evaluated by the doctor who then completed a
number of ratings of the severity and nature of
her symptomatology to provide data for assessing
drug effectiveness. The patient was seen again after
2 weeks and after 4 weeks on medication. At each
of these patient contacts, the psychiatrist reviewed
the patient’s complaints with her, inquiring about
how much better or worse she felt since beginning
to take the medication. He then asked “How else
did the drug make you feel?” Somatic complaints
such as sleepiness, dizziness, headache, and pares-
thesia attributed by the patient to her medicine were
recorded on a form provided for that purpose. 
These complaints were considered as side reactions 
when they had not been recorded at the pretreat- 
ment session. Thus to be classed as a side reactor, 
it was necessary for the patient to volunteer at 
least one symptom which she attributed to her 
medication and which the treating psychiatrist re- 
garded as distinctive from her original complaints. 
The actual number of side reactions ranged from 
1 to 4 per patient, with a median of 2. Conse- 
quently, although the reliability of side-reaction 
identification was not obtained, lack of reliability 
would serve to attenuate results rather than to
introduce systematic bias. The treating psychiatrist
was at no time aware of the patients’ hostility 
scores nor of the hypotheses under evaluation. 


Measures 


Hostility measures were obtained from the Buss- 
Durkee Hostility Inventory (Buss, 1961; Buss & 
Durkee, 1957). In addition to a Total Hostility 
score, an Index of Direct Hostility was obtained 
by averaging scores from the Assault and Verbal 
Hostility subtests and an Index of Indirect Hostility 
by averaging scores from the Indirect Hostility, 
Irritability, and Resentment subtests. In addition, 
to render Indirect Hostility scores (9.33 items) 
comparable in magnitude to Direct Hostility scores 
(11.50 items), Indirect Hostility scores were multi- 
plied by a factor 11.50/9.33, or 1.232. 
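
The index construction described above can be sketched as follows. The subtest keys, the function name, and the example scores are hypothetical; only the grouping of subtests and the 11.50/9.33 rescaling factor come from the text.

```python
from statistics import mean

# Subtests averaged into each index, as described in the Measures section.
DIRECT_SUBTESTS = ("assault", "verbal")
INDIRECT_SUBTESTS = ("indirect", "irritability", "resentment")

# Rescaling factor stated in the text: mean item counts of 11.50 and 9.33.
INDIRECT_SCALE = 11.50 / 9.33


def hostility_indexes(subtest_scores):
    """Return the Direct index, the rescaled Indirect index, and their difference."""
    direct = mean(subtest_scores[name] for name in DIRECT_SUBTESTS)
    indirect = mean(subtest_scores[name] for name in INDIRECT_SUBTESTS) * INDIRECT_SCALE
    return {"direct": direct, "indirect": indirect, "difference": direct - indirect}


# Hypothetical patient (not a record from the study).
scores = {"assault": 4.0, "verbal": 6.0,
          "indirect": 3.0, "irritability": 5.0, "resentment": 4.0}
result = hostility_indexes(scores)
print(round(result["direct"], 2))    # 5.0
print(round(result["indirect"], 2))  # 4.93
```

Rescaling the Indirect index before subtracting keeps the Direct minus Indirect difference from merely reflecting the unequal number of items behind the two indexes.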


RESULTS AND DISCUSSION 


In the formulation of Hypothesis 1, it was 
argued that side reactions might be consid- 
ered an indirect expression of hostility re- 
sorted to by patients who find it difficult to 
express anger directly toward their phy-
sician. Consequently, it was expected that
patients manifesting side reactions during the 
course of treatment should have been more 
reluctant to endorse questionnaire items de- 
scriptive of any form of hostile behavior 
and therefore should have obtained lower 
pretreatment Buss-Durkee Total Hostility 
scores. Table 1 presents means for the sev-
eral hostility scores for side reactors and non-
side-reactors. Reference to this table reveals
that Total Hostility score was significantly 
higher for non-side-reactors than for side 
reactors (t = 2.78, p < .01). The hypothesis 
was thus confirmed. 

It was argued that side reactors would 
less readily endorse questionnaire items re- 
lating to Direct Hostility than items relating 
to Indirect Hostility expression and that non- 
side-reactors would either show no prefer- 
ences for Direct or Indirect Hostility items 
or perhaps more readily endorse Direct than 
Indirect Hostility items. 

2 Negativism and Suspiciousness, the two remain- 
ing subscales administered, were not included in 
predictions concerning the differential role of Direct 
and Indirect Hostility. Negativism, the shortest of 
the subscales, containing but five items, was found 
by both Buss (1961) and by us to have the lowest 
test-retest reliability of all the subscales. Suspicious- 
ness, with its probable involvement of projective 
defenses, did not seem to fit readily into the Direct- 
Indirect Hostility classification.


TABLE 1

INDEXES OF DIRECT AND INDIRECT HOSTILITY, MEAN DIFFERENCES BETWEEN
THESE INDEXES, AND MEAN TOTAL HOSTILITY SCORES FOR
PATIENTS CLASSIFIED BY SIDE-REACTION STATUS AND TREATMENT AGENT

Drug-treated group Placebo-treated group Total group
Side  Non-side- Side  Non-side- Side Non-side-
reactors reactors reactors reactors reactors reactors
(N=16) (N=11) t (N=10) (N=10) t (N=26) (N=21) t
Direct È TA 
hostility 3.75 SA 3.21s 2.85 5.20 3.18% 3.40 : a 
Indirect 
hostility 4.21 5.33 1.45 4.15 4.10 4 4.19 4.73 86 
Direct — 
indirect 
hostility —.46 44 1.57 —1.30 1.10 2.94** —.79 «477 3.09% 
Total 
hostility 23.38 34.27 2.69** 22.50 28.50 1.28 23.05 31.52 2.78 
*p= 
ah = 03: 
** b = 101, 
srb = (001. 


Hypothesis 2, derived from this argument, 
predicted greater differences between side re- 
actors and non-side-reactors in Direct than in 
Indirect Hostility. The data presented in 
Table 1 are generally consistent with this 
hypothesis. For the total patient group, the 
mean Index of Direct Hostility is signifi- 
cantly higher for non-side-reactors than for 
side reactors (t = 4.54, p < .001). Mean
Direct Hostility is also significantly higher
in non-side-reactors than in side reactors
when drug-treated (t = 3.21, p < .01) and
placebo-treated (t = 3.18, p < .01) patients
are considered separately. However, side re- 
actors and non-side-reactors do not differ 
significantly in mean Index of Indirect Hostil- 
ity in either the total patient group or the 
two treatment groups separately considered. 
In addition, the difference between side re- 
actors and non-side-reactors in the difference 
between Indexes of Direct and Indirect Hos-
tility attains statistical significance (t = 3.09,
p < .005) for total patient sample.

Hypothesis 3 was based upon the greater 
frustration of desire for symptom alleviation 
and the lack of contamination by drug- 
induced physiological effects of reports of side 
reactions found in the placebo-treated group.
It posits greater differences between side re-
actors and non-side-reactors of the kind
specified in Hypotheses 1 and 2 in the placebo-
treated than in the drug-treated group. This
hypothesis receives only partial support. As
anticipated, it was in the placebo-treated
group that side reactors and non-side-
reactors differed more sharply in Direct than
in Indirect Hostility. However, contrary to
expectation, it was in the drug-treated rather
than in the placebo-treated group that the
difference between side reactors and non-side-
reactors in pretreatment Total Hostility score
was the greater.

The data obtained would seem to provide
general support for the contention that side
reactors and non-side-reactors differ in their
orientation toward the expression of Direct
and Indirect Hostility. Stifler (1965) has
noted that highly anxious college students
obtain the highest scores on the Indirect
Hostility subscales of the Buss-Durkee, while
less anxious students obtain highest scores on
Direct Hostility scales. His interpretation is
that the more anxious subjects, because of
the greater threat which the direct expression
of hostility involves for them, give predomi-
nantly indirect expression to their anger. He
maintains that less anxious individuals, less
threatened by hostile behavior, express their
angry feelings in a more direct way.

Since the links between the authors’ overall 
rationale and formulation of specific hypothe- 
ses are at points admittedly tenuous, it is not 
surprising that more detailed predictions are 
but partially confirmed. More specifically, in 
the drug-treated group the difference between 
side reactors and non-side-reactors in Direct 
Hostility score is greater than their difference 
in Indirect Hostility score, but not signifi-
cantly so. The mean pretreatment Direct
Hostility score of the drug-treated side re- 
actors (3.75) is higher than that of the 
placebo-treated side reactors (2.85), with the 
result that for drug patients the difference 
between side reactors and non-side-reactors 
(—.46) is smaller than the corresponding dif- 
ference for placebo-treated patients (—1.30). 
Since the physiologically produced discomfort 
resulting from tranquilizers varies widely 
from patient to patient, it is possible for 
side reactions reported by drug-treated pa- 
tients to range from those based upon little 
or no physiologically induced discomfort to 
those based upon physiologically based dis-
turbances. Consequently, in the drug-treated
group, it might be speculated that two 
mechanisms may be contributing to the re- 
porting of side reactions: (a) Patients with 
difficulty in giving direct expression to hostil- 
ity and who are subjected to little or no 
pharmacologically produced discomfort may 
employ side reactions, as it has been con- 
tended that placebo patients do, as an in- 
direct expression of hostility; and (b) patients 
who more readily give direct expression to 
hostility and who are subjected to pharma- 
cologically induced discomfort may be more 
motivated by their hostility drive to com- 
plain about the discomfort they are feeling. 
Such speculations are consistent with the ob- 
tained data, but, of course, not definitely 
supported by them. 

In considering possible alternative explana- 
tions for the present results, one might argue 
that patients who do not report side re- 
actions are actually individuals with a gen- 
erally low level of somatic symptomatology, 
and that those patients who report side re- 
actions are individuals who have a larger 


number of somatic complaints. Should this 
prove to be the case and should patients 
with a limited amount of somatic symptomatol-
ogy for some reason report themselves to be
more hostile than patients with more exten- 
sive somatic symptomatology, then the pres- 
ent results might be accounted for without 
recourse to the argument that side reac- 
tions represent an indirect mode of hostility 
expression. 

Data were available for 41 of the present 
patient sample of 47 on a 64-item symptom 
check list, a scale slightly modified from 
Fisher, Cole, Rickels, & Uhlenhuth (1964) 
which requires the patient to rate himself on 
a 4-point scale on each of 26 somatic and 
38 psychological symptoms commonly en- 
countered in neurotics. If the differences in 
Total Hostility level and in Direct-Indirect 
Hostility differences found between side re- 
actors and non-side-reactors can be accounted 
for in terms of differences in level of somatic 
symptomatology, then: (a) Patients with a 
somatic-symptom level score below the dis- 
tribution median should obtain higher Total 
Hostility scores than patients with a somatic-
symptom score above the dis-
tribution median and (b) patients with
somatic-symptom levels above and below the 
distribution median should show a greater 
difference in Direct than in Indirect Hostility. 
Table 2 presents mean Total Hostility scores 
and mean Indexes of Direct Hostility and In- 
direct Hostility for the 20 high and 21 low 
somatizers. It will be noted that the mean 
Total Hostility score for the low somatizers 
(24.30) is below, not above, that for the 
high somatizers (30.33). Inspection of the 
table also reveals that the difference between 
high and low somatizers in Direct Hostility 
score is less, not more, than the Indirect 
Hostility score difference. The t for the dif-
ference in Total Hostility score is 1.66, that
for the difference in Direct-Indirect Hostility 
Indexes is 1.43, neither of which reaches sig- 
nificance at the 10% level. Thus, no support 
is found for the view that differences in 
Total Hostility score or in the directness of 
hostility expression between side reactors and 
non-side-reactors is a function of somatic
symptomatology.

TABLE 2

PRETREATMENT MEAN BUSS-DURKEE TOTAL HOSTILITY SCORES AND MEAN INDEXES OF DIRECT AND INDIRECT HOSTILITY FOR PATIENTS HIGH (ABOVE MEDIAN) AND LOW (AT MEDIAN OR BELOW) IN SOMATIZING TENDENCY

                            Direct       Indirect     Total
                            hostility    hostility    hostility
High somatizers (N = 20)    4.74         4.11         30.24
Low somatizers (N = 21)     4.30         3.02         24.30

The possibility warrants consideration that
the very hostile patient may arouse sufficient
antagonism in his physician that an atmos- 
phere is created in which the physician 
subtly discourages the presentation of com-
plaints about medication. Physician ratings
on a 5-point scale of liking for the patient 
were available at the pretreatment, 2-week 
and 4-week points. Of the six comparisons 
(two medication groups at three points in 
time), four reflect greater physician liking 
for the non-side-reactors. The highest t value
for any of the comparisons is 1.71, which
for 18 df barely misses significance at the
10% level. Therefore, it may at least be
said that no support is found in these data 
for the presence of an artifact of the type 
here considered. 

A qualifying note must be added concern- 
ing the findings of the present investigation. 
The sample studied consists entirely of 
Negro females. While Negro females consti-
tute a large segment of clinical patients in
the Philadelphia area and other large urban 
centers with considerable Negro population 
and hence represent a significant group about 
which generalizations need to be made, pres- 
ent study findings do not permit generaliza- 
tion to other race-sex subgroups. Since it 
is to be expected that the manner of express- 
ing hostility has imposed upon it differing 
constraints for Negroes than for whites, and 
for females than for males, there is reason 
to expect that relationships between side re-
actors and manner of hostility expression may
be different for patients of differing race-sex 
group membership. 
NOTES AND COMMENTS
AN EVALUATION OF THE TRAIL MAKING TEST 1


SIDNEY A. ORGEL AND ROBERT D. McDONALD 2


Veterans Administration Hospital and State University of New York 
Upstate Medical Center, Syracuse, New York 


3 groups of Veterans Administration hospitalized white males, homogeneous 
as to age, sex, and education were examined with the Trail Making Test. 
The 3 groups each containing 21 Ss are identified as follows: brain-damaged, 
mixed neuropsychiatric, and hospitalized controls. The null hypothesis that 
the Trail Making Test scores of the 3 groups would not differ significantly 
could not be rejected. Although Part A of the Trail Making Test differs 
significantly from Part B, the differential psychological attributes being tapped 
by the 2 parts of the test are unknown. It is suggested that if the Trail 
Making Test is to be applied in a clinical setting, considerably more information
is needed concerning test performance as a function of age.


The literature contains numerous reports of 
the efficacy of the Trail Making Test as a potent 
assessment device differentiating organic and 
nonorganic patients (Armitage, 1946; Korman & 
Blumberg, 1963; Reitan, 1955, 1958). Alvarez 
(1962) reported that deficits in performance- 
time scores on the Trail Making Test are more 
likely to be a function of cerebral lesions on 
perceptual and motor integrations than of low- 
ered motivation due to depressed mood in the 
absence of organic involvement. 

Other studies have indicated that among the 
brain-damaged groups, those with static lesions 
generally perform at levels superior to those with 
acute lesions (Fitzhugh, Fitzhugh, & Reitan, 
1962). Significant correlations between Wechsler-
Bellevue subtests and Trail Making Test per-
formance have also been reported (Reitan, 1959).
Differential effects of lateralized lesions upon 
Trail Making Test scores (Reitan & Tarshes, 
1959) and effects of chronic versus current 
lesions on test performance (Fitzhugh, Fitzhugh, 
& Reitan, 1963) have also been noted. 

Brown, Casey, Fisch, and Neuringer (1958) 
were unable to confirm these positive findings 
concerning the Trail Making Test. These authors 
state, “inasmuch as the diagnostic problem fre- 
quently encountered in clinical practice is one 
of distinguishing between psychosis and brain 
damage, the present findings suggest a verdict 
of ‘no value’ for the Trail Making Test in this 
context.” However, it is not clear from the report
of this investigation whether the variables of
age and sex were controlled. Brown et al. (1958)
also noted a personal communication from I. W.
Scherer and J. F. Winne indicating that these
investigators were not able to relate Trail Making
Test performance and the cerebral insult of the
postlobotomized patient.

1 The authors wish to express their thanks to
Diana Norcross and Patricia Hogan for their assist-
ance in collecting some of these data.

2 Now at HumRRO, Presidio of Monterey, Cali-
fornia.

The aim of this study was to evaluate the 
Trail Making Test among clinically relevant 
groups while controlling age, sex, and education. 
Stating the hypothesis in null form: groups of 
neuropsychiatric, brain-damaged, and hospitalized 
control patients who do not differ from each 
other in terms of age and years of schooling will 
have Trail Making Test scores which do not 
significantly differ. 


METHOD 


Subjects 


The subjects were all white, male veterans cur- 
rently hospitalized at the Syracuse Veterans Ad- 
ministration Hospital. The 21 subjects (Ss) com- 
prising the brain-damaged group all presented 
neurological and/or pneumoencephalographic evi- 
dence of cortical damage. The diagnostic categories 
of the brain-damaged group included intracranial 
neoplasm, four; intoxication (alcoholism), four; 
circulatory disturbance other than arteriosclerosis 
(cerebral vascular accident), three; convulsive dis- 
order, two; chronic brain syndrome of unknown 
cause, two; cerebral arteriosclerosis, two; intracranial 
infection, syphilis, multiple sclerosis, and senile brain 
disease, one each. The 21 Ss within the neuropsy- 
chiatric group included the following diagnostic 
categories: personality trait disturbance, six; psycho- 
neurotic (obsessive-compulsive), three; psychoneu- 
rotic (conversion), two; psychoneurotic (anxiety), 
two; psychoneurotic (reactive depression), one; 
schizophrenia (chronic, undifferentiated), two; 
schizophrenia (paranoid), three; schizophrenia (cata-
tonic), two. The diagnostic categories of the 21 pa-
tients comprising the hospitalized control group in- 
cluded humerus fracture, three; arthritis, three; ulcer, 
two; tibia fracture, two; hammertoe, femur fracture, 
hypertension, rheumatic heart disease, atypical chest 
pain, lymphoma, calcium left shoulder, diabetes, 
renal stone, infected left earlobe, ganglion left second 
toe, one each.

Mean ages of the brain-damaged, neuropsychiatric, 
and hospitalized control groups were 43.19 (SD 
= 13.18), 40.05 (SD = 10.29), 43.14 (SD = 14.56). 
Corresponding means for years of schooling were 
11.62 (SD = 3.45), 12.14 (SD = 3.04), 11.10 (SD
= 1.09). No significant intergroup differences in age
and years of schooling were found (F with 2 and
60 df = 0.415 and 0.413, respectively).


Procedure 


The Trail Making Test consists of two parts that 
require S to draw lines, in order, connecting 25
circles. In Part A of the test, the circles are num-
bered from 1 to 25. Part B contains circles that
are numbered from 1 to 13 and, in addition, circles 
that are lettered from A to L. The S must draw 
lines from 1 to A, A to 2, 2 to B, etc. The score 
for each part is based on the performance time in 
seconds. Errors count only in that S spends time 
in correcting them. 
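
The connection order just described can be sketched in Python. This is an illustration of the ordering rule only (numeric order for Part A, number-letter alternation for Part B); the function names are hypothetical and are not taken from the test manual.

```python
from string import ascii_uppercase


def part_a_sequence(n_circles: int = 25) -> list[str]:
    """Part A: circles numbered 1..25, connected in numeric order."""
    return [str(i) for i in range(1, n_circles + 1)]


def part_b_sequence(n_numbers: int = 13, n_letters: int = 12) -> list[str]:
    """Part B: alternate numbers and letters (1, A, 2, B, ..., L, 13)."""
    letters = list(ascii_uppercase[:n_letters])  # 'A' through 'L'
    sequence = []
    for i in range(n_numbers):
        sequence.append(str(i + 1))
        # A letter follows every number except the last one.
        if i < len(letters):
            sequence.append(letters[i])
    return sequence


if __name__ == "__main__":
    print(part_a_sequence())
    print(part_b_sequence())
```

Both parts therefore involve 25 circles; Part B differs only in requiring the S to alternate between the two series.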

All Ss were individually administered the Trail 
Making Test under standard conditions as described 
by Reitan (1956). The brain-damaged and neuro-
psychiatric patients were examined with the Trail
Making Test as part of routine referral for psycho- 
logical evaluation. Control Ss were randomly selected 
from hospital case records which indicated absence 
of neurological or psychiatric histories. There were 
eight different test administrators in all with ap- 
proximately equally distributed testing among the 
several examiners over a period of 18 months.


RESULTS AND DISCUSSION


Table 1 presents the results of an analysis of 
variance of the Trail Making Test performance 
of the three groups. Inspection of these data
indicates that the null hypothesis cannot be re-
jected. Clinically hospitalized groups of neuro-
psychiatric, brain-damaged, and control Ss did
not differ significantly from each other in their 
Trail Making Test performance. Generally, these 
data, with age and sex controlled, support the 
results reported by Brown et al. (1958). At least 
in terms of a Veterans Administration all-male 
population of patients, homogeneous as to age 
and education, the Trail Making Test appears 
to have little value in distinguishing between 
brain-damaged and psychiatric impairment. Fur-
thermore, a clinically “normal” group did not
perform significantly differently from the brain- 
damaged or psychiatric populations on the Trail 
Making task. 


TABLE 1

ANALYSIS OF VARIANCE OF TRAIL MAKING TEST PERFORMANCE OF BRAIN-INJURED, NEUROPSYCHIATRIC, AND HOSPITALIZED CONTROL GROUPS

Source                                     df    Mean square    F
Between groups                              2       18.20         1.95
Between Ss in same group (error)           60        9.34
Between Parts A and B                       1      672.07       223.65*
Interaction Parts × Groups                  2        4.07         1.36
Interaction Pooled Ss × Parts (error)      60        3.01

* p < .001.
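
As a reading aid, the F ratios in Table 1 can be recomputed from the printed mean squares. The sketch below assumes the usual mixed-design convention, in which the groups effect is tested against the between-Ss error term and the Parts and Parts × Groups effects against the pooled Ss × Parts error term; the variable names are illustrative only. The recomputed values agree with the published ones only to within rounding of the printed mean squares.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


# Mean squares as printed in Table 1.
MS_BETWEEN_SS_ERROR = 9.34   # Between Ss in same group (error)
MS_SS_X_PARTS_ERROR = 3.01   # Pooled Ss x Parts (error)

f_groups = f_ratio(18.20, MS_BETWEEN_SS_ERROR)
f_parts = f_ratio(672.07, MS_SS_X_PARTS_ERROR)
f_interaction = f_ratio(4.07, MS_SS_X_PARTS_ERROR)

if __name__ == "__main__":
    print(f"Groups:         F = {f_groups:.2f}")
    print(f"Parts A vs. B:  F = {f_parts:.2f}")
    print(f"Parts x Groups: F = {f_interaction:.2f}")
```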


It is noteworthy that Part A of the Trail 
Making Test differs significantly from Part B. 
The specific psychological behaviors that account 
for successful performance on Parts A and B of 
this test remain obscure. 

Our results are not in accord with Reitan
(1955, 1958), and this discrepancy is particu-
larly difficult to comprehend in view of the simi-
larity between the patients in his brain-damaged
and control groups and those reported in this
study. The one difference between populations
that does emerge is that our patients tended on 
the average to be approximately 10 years older 
than those of Reitan. 

It is suggested that if the Trail Making Test
is to be applied in a clinical setting, considerably
more information is needed concerning test per-
formance as a function of age and that additional
evidence as to the differential psychological 
attributes being tapped by the two parts of the 
test is sorely lacking. 


REFERENCES 


Alvarez, R. R. Comparison of depressive and brain-
injured subjects on the Trail Making Test. Per-
ceptual and Motor Skills, 1962, 14, 91-96.

Armitage, S. G. An analysis of certain psychological
tests used for the evaluation of brain injury.
Psychological Monographs, 1946, 60(1, Whole No.
277).

Brown, E. C., Casey, A., Fisch, R. I., & Neuringer,
C. Trail Making Test as a screening device for the
detection of brain damage. Journal of Consulting
Psychology, 1958, 23, 469-474.

Fitzhugh, K. B., Fitzhugh, L. C., & Reitan, R. M.
Relation of acuteness of organic brain dysfunction
to Trail Making Test performances. Perceptual
and Motor Skills, 1962, 15, 399-403.

Fitzhugh, K. B., Fitzhugh, L. C., & Reitan, R. M.
Effects of “chronic” and “current” lateralized and
non-lateralized cerebral lesions upon Trail Making
performances. Journal of Nervous and Mental Dis-
ease, 1963, 137, 82-87.

Korman, M., & Blumberg, S. Comparative efficiency
of some tests of cerebral damage. Journal of Con-
sulting Psychology, 1963, 27, 303-309.

Reitan, R. M. The relation of the Trail Making
Test to organic brain damage. Journal of Con-
sulting Psychology, 1955, 19, 393-394.

Reitan, R. M. Trail Making Test: Manual for ad-
ministration, scoring and interpretation. Indiana
University, 1956. (Mimeo)

Reitan, R. M. Validity of the Trail Making Test
as an indicator of organic brain damage. Per-
ceptual and Motor Skills, 1958, 8, 271-276.

Reitan, R. M. Correlations between the Trail
Making Test and the Wechsler-Bellevue Scale.
Perceptual and Motor Skills, 1959, 9, 127-130.

Reitan, R. M., & Tarshes, E. L. Differential effects
of lateralized brain lesions on the Trail Making
Test. Journal of Nervous and Mental Disease,
1959, 129, 257-262.


(Received July 1, 1965)


Journal of Consulting Psychology
1967, Vol. 31, No. 1, 79-82


MEDIATED AND PRIMARY STIMULUS-GENERALIZATION 
BASES OF SEXUAL SYMBOLISM 1


AUSTIN JONES AND DAVID S. LEPSON
University of Pittsburgh 


Ss were asked to make associations of masculinity or femininity to simple 
geometric figures reproduced in such a way as to encompass both the primary 
stimulus-generalization basis of sexual symbolism emphasized by Freud and 
a mediated basis unrelated directly to psychoanalytic theory—the color di- 
chotomy of black-gray. The materials were constructed so that each response 
would be scored as consistent with 1 basis of symbolism only. The results 
showed that both the primary and mediated bases of symbolism were very 
significantly effective in determining responses (p < .00001), but that their 
effects were almost exactly equal in magnitude, thus suggesting that an em- 
phasis on primary stimulus-generalization explanations of sexual symbolism 
may be somewhat arbitrary. Data are also presented relevant to the comparisons
of sex of Ss and of psychiatric and nonpsychiatric groups.


The Freudian view that pointed, elongated 
objects symbolize the penis and that rounded, 
enclosing objects symbolize the vagina has re- 
ceived some support from experiments in which 
associations of “maleness” and “femaleness” were
given to simple geometric figures of various 
pointed and elongated or rounded and enclosing 
shapes (Jones, 1956, 1962). Subjects (Ss) re- 
sponded in accordance with the Freudian hy- 
pothesis significantly more often (p<.0001) 
than would be expected by chance. Male Ss re- 
sponded significantly more frequently in support 
of the hypothesis than did females, and nonpsy- 
chiatric Ss more frequently than did psychiatric 
Ss matched for age and education. General sup- 
port for the Freudian hypothesis has also been 
reported by Lessler (1964).

1 The authors are indebted to Herbert Levit, for 
the availability of patients as research subjects at 
Dixmont State Hospital, Glenfield, Pennsylvania. 


The statements of Freud concerning sexual 
symbolism, as in The Interpretation of Dreams 
(1958), emphasize the primary stimulus-generali- 
zation basis of symbolism, and the studies cited 
above (Jones, 1956, 1962) dealt only with that 
aspect. Although the results provided evidence 
that a primary stimulus-generalization type of 
sexual symbolism does occur, a question may be 
raised as to the importance of this type of sym- 
bolism relative to other, mediated types. The 
developmental experiences of most humans have 
presumably provided for the learning of many 
different symbolic relationships concerning sexu- 
ality, and it appears at present that we know 
little about their relative response strengths. The 
purpose of the present study is to take a first 
step in this direction by comparing the response 
strengths associated with sexual symbolism of 
two different types, the primary stimulus-gen-
eralization type emphasized by Freud, and a
mediated type not specifically discussed in psy- 
choanalytic theory (although this body of theory 
does deal with the general class of mediated 
symbolism). 

The strategy adopted for this purpose required 
that Ss be provided with stimuli representing 
simultaneously the two “competing” bases of
symbolism and that the S’s response to each
stimulus be clearly related to one and only one
of the two bases. The geometric stimuli used in
the prior study were employed here to represent
the primary stimulus-generalization basis of sym-
bolism. A mediated dimension of symbolism was
then sought which would meet the following cri- 
teria: (a) that the dimension be independent of 
any primary stimulus dimension pertaining to 
the genital organs, (b) that the principle of 
mediation be unknown to Ss in advance, (c) that 
the stimuli elicit a high proportion of response 
consistent with the mediated symbolic principle 
when presented alone, and (d) that the stimuli 
representing the mediated symbolism be of such 
nature as to be readily combined with the geo- 
metric figures which constitute the primary 
stimulus-generalization stimuli. 

The four criteria noted above were found, in 
preliminary experimentation, to be satisfied by
the color dichotomy black-gray. Subjects were 
presented a series of cards on which were painted 
stimuli of either black or gray tempera. Each 
stimulus consisted of a filled square with slightly 
rounded corners (a shape intended to be mini- 
mally related to primary stimulus-generalization 
hypotheses of sexual symbolism). Each S re-
ceived five trials of each “color” in randomized
order, and was asked to give rapid associations
of maleness or femaleness. Of 20 college Ss, 10
male and 10 female, all but 1 associated each
black stimulus with maleness and each gray
stimulus with femaleness. Although many Ss
were somewhat unclear as to the rationale for
their response, it appeared generally that the
black stimuli led to associations of “stronger
color” and hence to the greater strength of males.
It appeared unlikely that this dimension of sexual 
symbolism was understood by the Ss in advance, 
although conclusive data on this point were not 
available. 


The strategy employed in this study is in cer- 
tain respects similar to that of Lessler (1964), 
who asked Ss to respond to figures possessing 
both Freudian and “cultural” referents, which 
were for certain figures congruent but incon- 
gruent for others. The present study differs from 
Lessler’s in that the comparison is of primary
stimulus and mediated generalization bases of
symbolism, with the mediated basis being rela-
tively independent of any generally understood
culturally determined sex associations (such as
those pertaining to baseballs and rolling pins,
for example, two symbols employed in the Lessler
study).


The mediated dimension of symbolism was repre- 
sented in the materials by preparing two series of 
the geometric figures on white 3 × 5 inch cards, one
painted in black tempera, the other in gray. In the
resulting set of 20 figures were 10 for which either
basis of symbolism would be expected to elicit the
same sex associations, the five pointed or elongated
forms painted black (and thus male according to
both dimensions) and the five rounded or enclosing
forms painted gray (and thus female). The remain-
ing 10 figures are each characterized by the presence
of the male attribute from one dimension of sym-
bolism and the female attribute from the other. Thus
the five pointed, elongated shapes reproduced in gray
would be expected to elicit associations of maleness
or femaleness according to whether the primary
stimulus-generalization basis of response or the medi-
ated basis were dominant, and conversely for the
five rounded or enclosing shapes reproduced in black.
These latter 10 figures will be referred to as the
“test” figures; the first 10 described were also
presented to the Ss and will be referred to as the
“buffer” figures. In the balance of the paper, the
black-gray dimension will be referred to as the
mediated basis, and the primary stimulus-generaliza-
tion dimension simply as the “primary” basis of
sexual symbolism.
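
The construction of the 20 figures, and the way a single response is scored against the two bases, can be sketched as follows. This is a minimal illustration in Python under the design stated above (five forms per shape class, each reproduced in black and in gray); the labels "pointed" and "rounded" and the function names are shorthand introduced here, not the authors' terms.

```python
from itertools import product

# Sex association predicted by each basis of symbolism.
PRIMARY = {"pointed": "male", "rounded": "female"}   # shape class
MEDIATED = {"black": "male", "gray": "female"}        # color


def build_figures(forms_per_class: int = 5) -> list[dict]:
    """Cross shape class with color; agreement of the bases marks a buffer figure."""
    figures = []
    for shape, color in product(PRIMARY, MEDIATED):
        kind = "buffer" if PRIMARY[shape] == MEDIATED[color] else "test"
        for form in range(1, forms_per_class + 1):
            figures.append(
                {"shape": shape, "color": color, "form": form, "kind": kind}
            )
    return figures


def consistent_basis(figure: dict, response: str) -> str:
    """Classify a 'male'/'female' response by the basis it agrees with."""
    primary_hit = PRIMARY[figure["shape"]] == response
    mediated_hit = MEDIATED[figure["color"]] == response
    if primary_hit and mediated_hit:
        return "both"        # possible only for buffer figures
    if primary_hit:
        return "primary"
    if mediated_hit:
        return "mediated"
    return "neither"         # possible only for buffer figures
```

On a test figure every response is consistent with exactly one basis, which is the property the design relies on; on a buffer figure a response is consistent with both bases or with neither.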

Procedure. Subjects were seen individually by the
experimenter (E). The 20 stimuli were presented one
at a time in a fixed, randomized order. The Ss were
requested to associate immediately either masculine
or feminine personality to each of the stimuli. The
exact instructions to Ss are given in a prior report
(Jones, 1965).
RESULTS


Responses to the 10 test figures were tabu- 
lated in terms of their consistency with the 
primary and with the mediated bases of sym- 
bolism. The null hypothesis may be tested in 
this situation by comparing either set of scores
with the chance expectancy of 5.0 (i.e., five of 
the responses to the test figures being consistent 
with each basis). No significant difference was 
found between the mean number of responses 
(4.75) consistent with the mediated basis and 
with the chance expectancy of 5.0 (t = .91,
df = 79, .4 > p > .3). As t is identical for the
opposite comparison (responses consistent with 
the primary basis versus chance expectancy), 
this has the meaning of no significant difference 
between the number of responses consistent with 
the primary and with the mediated bases of sym- 
bolism. Similar results were obtained when each 
of the four subgroups of Ss were considered 
separately. Analysis of variance showed no sig-
nificant differences between the sexes of Ss and
between the college and psychiatric groups (both
Fs < 1.0).
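
The statement that t is identical for the opposite comparison follows from the scoring: each of the 10 test responses is consistent with exactly one basis, so a subject's primary score equals 10 minus the mediated score. A minimal Python sketch of the one-sample t test against the chance value of 5.0 illustrates this. The scores below are invented for illustration and are not the study's data; only the symmetry is the point.

```python
import math
from statistics import mean, stdev


def one_sample_t(scores: list[float], mu: float) -> float:
    """t = (sample mean - mu) / (sample SD / sqrt(n)), with n - 1 df."""
    n = len(scores)
    return (mean(scores) - mu) / (stdev(scores) / math.sqrt(n))


# Hypothetical mediated-consistent scores (out of 10 test figures).
mediated = [4, 6, 5, 3, 7, 4, 5, 2]
# Each test response fits exactly one basis, so the scores sum to 10.
primary = [10 - m for m in mediated]

t_mediated = one_sample_t(mediated, 5.0)
t_primary = one_sample_t(primary, 5.0)

if __name__ == "__main__":
    # The two statistics are equal in magnitude and opposite in sign.
    print(t_mediated, t_primary)
```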

The preceding analysis assumes that each re- 
sponse to the test figures is consistent with one 
of the two bases of symbolism defined in this
study, but it is at least conceivable that some 
unintended cue aspects of the figures determined 
responses and that neither of the intended bases 
of symbolism actually elicited significant sexual 
symbolic response. This possibility regarding the 
primary basis would appear very unlikely in light 
of the results of the prior study (Jones, 1956), 
and the preliminary experimentation of this study 
strongly supports the mediated basis of symbol- 
ism. However, it is possible to demonstrate di- 
rectly the operation of each basis of symbolism 
within the responses to the test figures.
The number of masculine associations given to 
the five pointed, elongated figures painted black 
(buffer figures) was compared with the number of 
masculine associations given to the five pointed, 
elongated figures painted gray (test figures). If 
the mediated basis was effective in determining 
symbolic response, the mean number of mascu- 
line associations should be higher to the buffer 
than to the test figures. This expectation was 
demonstrated, the means being 3.88 and 2.75, 
respectively (t = 5.71, df = 79, p < .00001).
Similar t tests for each subgroup of Ss were
carried out. The differences, all in the same
direction, were each significant at or beyond the
.05 level, except for the psychiatric females
(p < .20). This procedure was carried out also
with respect to the mean number of feminine 


associations given to the five rounded, enclosing 
figures painted black. With all Ss pooled, the 
mean number of feminine associations to the 
buffer figures, 3.68, was shown to be reliably 
greater than that to the test figures, 2.50 
(t = 5.53, df = 79, p < .00001), again demon-
strating the operation of the mediated basis.
Separate t’s for the subgroups each yielded p’s
at or beyond the .05 level. 

Parallel analyses were made concerning the 
relationship between the two classes of forms
(primary basis) and the sex category of associa- 
tions when the mediated dimension is held con- 
stant. The effectiveness of the primary basis in 
determining sexual symbolic response received 
similar verification, the two pooled t’s being 8.14
and 7.50, with df = 79 and p < .00001 in each
case.

The extent to which response consistent with 
the mediated dimension was related to sex of Ss 
and to diagnostic category (psychiatric vs. non- 
psychiatric) was assessed by an analysis in which 
the data consisted of difference scores computed 
as the number of responses to buffer figures con- 
sistent with both the primary and mediated 
bases, minus the number of responses to the test 
figures consistent with the primary basis. Stated
informally, the difference scores represent the 
change in the sex of the responses as a function 
of the change in “color” of the figures. There 
was no significant difference between sexes of 
Ss (F<1.0). The difference between the psy- 
chiatric and nonpsychiatric groups showed mar- 
ginal significance (F = 3.89, df = 1/76, p <.06), 
the means for the two groups being 1.58 and 
3.02, respectively. A parallel analysis of re- 
sponses consistent with the primary dimension 
was carried out. There was again no significant 
difference between sexes (F = 1.38, df= 1/76, 
p> .05). In contrast to the preceding analysis, 
the difference between the psychiatric and non- 
psychiatric Ss did not approach significance. 

Analysis of only the buffer figures permits 
comparisons of the frequency of responses which 
are consistent with neither basis of symbolism; 
that is, responses of “masculine” to rounded, en- 
closing forms reproduced in gray, and “feminine” 
to pointed, elongated forms reproduced in black. 
The median test was employed for these com- 
parisons. The frequency of such “inconsistent” 
responses was significantly greater in the psychi- 
atric group than in the nonpsychiatric group 
(mdn = 3.41 and 1.96, respectively, χ² = 7.20,
df = 1, p < .01). There was no significant dif-
ference between male and female Ss with respect
to frequency of “inconsistent” response; in fact, 
the medians were so nearly identical, 2.70 and 
2.79, respectively, as to result in χ² = .00 and
p > .99.


DISCUSSION


The results suggest that the psychoanalytic 
emphasis on the primary stimulus-generalization 
basis of sexual symbolism is probably incorrect 
insofar as it is taken to mean that the primary 
basis is in some sense “stronger” or superordinate 
with respect to other, mediated bases. No sig- 
nificant difference was found between the num- 
ber of responses consistent with the primary 
basis and the mediated basis. To make certain 
that this result was not due simply to a failure 
of both dimensions to elicit sexual symbolic 
response, the strength of each symbolic process 
was tested separately, with highly significant sym- 
bolic determination of response being shown in 
each case (p< .00001). Thus, both the primary 
and mediated bases of symbolism were verified 
and shown to be approximately equally effective 
in determining response. The implications of the 
results are limited by the fact that the medi- 
ated basis selected for study—the gray-black 
dichotomy—constitutes but one of many medi- 
ated dimensions which might also have been 
assessed.

The present findings are in general agreement 
with those of Lessler, who found that the 
cultural referents of sexual symbolism (which 
involve mediated generalization) were frequently 
effective in determining response despite the 
simultaneous presence of incongruent Freudian
referents. Lessler’s data, however, appear to 
indicate that in such competing situations the 
cultural or non-Freudian referent is not simply 
effective but is dominant, presumably because 
it is “. . . socially acceptable, nonthreatening,
and . . . consensually valid” (Lessler, 1964, p.
46). The results of the present experiment, in
contrast, show no evidence that either the medi- 
ated or primary basis of symbolism is dominant. 
The mediated basis, as it is represented in the 
present study, appears to differ from Lessler’s 
cultural referents principally in the absence of 
clearly established cultural stereotypes, or con- 
sensual validity, although the mediation presum- 
ably must be a cultural-linguistic product of 
some sort. Thus it appears that cultural referents 
are likely to be dominant only when the Ss have 
a clear understanding of their nature. 

The psychiatric Ss of the prior study (Jones, 
1956) made consistent sexual symbolic responses 


significantly less frequently than nonpsychiatric
Ss. In the present study this was partially veri-
fied. Analysis of responses consistent with the
mediated basis showed psychiatric Ss to respond
at a marginally significantly (p < .06) lower
rate than nonpsychiatric Ss. Should this differ-
ence be reliable, it appears likely that a task-
difficulty variable is involved, the mediated basis
requiring a more subtle discrimination. Consider-
ing responses consistent with the primary basis,
however, no difference between the means was
obtained. Clearer evidence is provided by the
analysis of responses to the buffer figures. The
psychiatric Ss made significantly more responses
(p < .01) which were inconsistent with both


highly personal, autistic symbol usage charac-
terizes severe personality disorder.

The present study failed to replicate the earlier 
finding of a significantly higher rate of sexual 
symbolic response in male Ss than in female Ss. 
Although the reason is not immediately apparent,
two possibilities may be suggested. The task
given the S in the present study is considerably
more complex, which may result directly in the
minimization of sex differences. Alternatively,
and perhaps more likely, the procedure of the
first experiment, involving “primary” symbols
only, and therefore being less disguised, may
have elicited social inhibition of report to a
greater extent in female Ss than in males. In the
present study, the greater complexity of the task
may have served to disguise its sexual implica-
tion more adequately, with the result that
the report of female Ss was less subject to
differential social inhibition.
MMPI PROFILES OF PARISHIONERS SEEKING 
PASTORAL COUNSELING 


EDWIN E. WAGNER AND RICHARD D. DOBBINS 
University of Akron 


The MMPI was administered to 40 parishioners who had sought aid in 
pastoral counseling and 40 members of the same congregation who had not 
sought such aid. An attempt was made to control for sex, age, education, 
and income. Significant differences between these groups were found on 10 
out of 12 scales. Using the MMPI as the criterion, the pastoral counseling 
group appeared to be more disturbed than the control group across a variety 
of psychological indexes. In view of these findings, a question was raised 
concerning the adequacy of ministerial training. 


Authorities seem to agree that the emotionally 
disturbed often seek help from the clergyman 
(e.g., Gurin, Veroff, & Feld, 1960; McCann, 
1962; Steiner, 1945). However, a perusal of the 
literature reveals no quantitative psychological 
data which might permit an objective assessment 
of the nature and extent of the psychopathology 
among people seeking pastoral counseling, and it 
is impossible to state whether individuals seeking 
pastoral counseling are more disturbed than 
fellow parishioners who do not seek such aid. 
The present study attempts to investigate this 
basic question. 


METHOD 


The MMPI was administered by R. D. Dobbins, 
the minister of the Assembly of God, to every 
parishioner who sought aid through pastoral counsel- 
ing over a consecutive 10-month period. These 
people did not come for help because pastoral 
counseling was advertised or particularly emphasized 
in the congregation; they were merely parishioners 
who, of their own accord, went to their minister’s 
office for counsel or advice. In all, 40 subjects were 
tested, 13 males and 27 females. Their average age 
was 28.5, SD 10.0; average number of years of 
education was 11.8, SD 2.4; average estimated in- 
come per year was $7,043.75, SD $1,694.54. A con- 
trol group was selected from among parishioners of 
the same church who had not sought pastoral coun- 
seling. An effort was made to choose a group which 
was demographically similar to the experimental 
group. The mean age of the controls was 29.4, SD 
10.1; mean number of years of education was 11.9, 
SD 2.1; mean income per year, $7,6.8.75, SD 
$1,634.38. There were no significant differences be- 
tween the groups on any of these variables. The 
MMPI was administered to the control group in 
the standardized manner and under the same condi- 
tions used to test the experimental group. No proto- 
col was accepted for analysis if, on inspection, an 
average of more than 1 unanswered item occurred 
in every block of 15 items or if the L score exceeded 
a raw score of 10. 
It was postulated, according to the null hypothe- 
sis, that there would be no significant differences 
between these groups on the T scores of any of 
the 12 MMPI scales. 


RESULTS AND DISCUSSION 


As shown in Table 1, the null hypothesis was 
accepted for the Mf and Ma scales and rejected 
for all the others. 

Although this study was designed primarily to 
determine whether those seen in pastoral counsel- 
ing do actually exhibit greater evidence of emo- 
tional disturbance when compared to fellow 
parishioners who do not seek such aid, some 
indication of the variety of disturbances en- 
countered and their degree of seriousness may be 
gained by comparing the two groups on the 
number of subjects ranging 2 SD or more above 
the mean (Table 1). The experimental group 
contained 151 scores spread across all 12 scales 
which were 2 SD above the mean; the control 
group only exceeded this limit 25 times on 10 
of the scales. The psychological disturbances of 
those seen for pastoral counseling were not 
concentrated on any particular scale but were 
distributed across the entire spectrum of emo- 
tional disorders. This would tend to contradict 
the stereotype entertained by many lay and pro- 
fessional people that the clergyman tends to see 
the discouraged or depressed person. In fact, 
these findings suggest that people seeking pastoral 
counseling might not differ greatly from those 
seen in clinical practice by psychologists or 
psychiatrists. 

The emotional disturbances of the pastoral 
counseling group cannot be dismissed as being 
superficial. Hathaway and McKinley (1951) con- 
sider T scores above 80 to be present only in 
those profiles of subjects in extreme or even 
disabling emotional pain. Eighteen parishioners 
(45%) seeking pastoral counseling had T scores 
above 80. 


TABLE 1 
MEANS, STANDARD DEVIATIONS, t VALUES, PROBABILITY VALUES, AND FREQUENCY OF SCORES 
BEYOND TWO STANDARD DEVIATIONS ABOVE THE MEAN FOR THE MMPI SCORES OF 
SUBJECTS SEEKING PASTORAL COUNSELING (EXP; N = 40) AND SUBJECTS NOT 
SEEKING PASTORAL COUNSELING (CON; N = 40) 

MMPI                                             No. of scores greater than 
scale   Group    Mean     SD      t       p      2 SD above the mean 
F       Exp.    59.33    9.46                      7 
        Con.    50.50    4.69    5.07   .0005      0 
K       Exp.    52.93    9.46                      2 
        Con.    57.98    8.00    2.56    …         4 
Hs      Exp.    61.48   13.18                     13 
        Con.    50.80    7.27    4.45   .0005      0 
D       Exp.    66.38   14.38                     19 
        Con.    51.55    8.57     …     .0005      2 
Hy      Exp.    67.60   12.22                     17 
        Con.    57.53    7.87     …      …         4 
Pd      Exp.    73.55   11.52                     26 
        Con.    58.98    7.64    6.59   .0005      3 
Mf      Exp.    51.25   14.08                      5 
        Con.    53.00    9.11     .65    ns        1 
Pa      Exp.    64.88   13.96                     14 
        Con.    52.23    8.97    5.27   .0005      2 
Pt      Exp.    66.85   11.81                     17 
        Con.    55.48    8.42    4.64   .0005      2 
Sc      Exp.    70.75   15.62                     20 
        Con.    54.45    7.50    4.11   .0005      2 
Ma      Exp.    56.00   10.39                      3 
        Con.    54.08    9.92     .33    ns        3 
Si      Exp.    59.10   11.68                      8 
        Con.    52.53    8.05    2.85   .01        … 

So great were the differences between the con- 
trol and experimental groups that it was found, 
post hoc, that a four-cell chi-square table, using 
a T score of 75 as an arbitrary cutoff point, 
would correctly classify 81.25% of the cases 
(χ² = 32.8, p < .001, φ = .64). 
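
The reported phi coefficient can be checked against the reported chi-square. The sketch below assumes the usual relation for a four-cell table, phi = sqrt(chi-square / N), with N = 80 (the 40 experimental plus 40 control subjects). The function name is illustrative and does not come from the paper, and the individual cell counts are not reported, so only the summary figures are used.

```python
import math


def phi_from_chi_square(chi_square: float, n_cases: int) -> float:
    """Phi coefficient for a 2 x 2 table, from its chi-square and total N."""
    return math.sqrt(chi_square / n_cases)


n_cases = 80  # 40 experimental + 40 control subjects (from the METHOD section)
phi = phi_from_chi_square(32.8, n_cases)

# 81.25% correct classification corresponds to a whole number of cases.
correctly_classified = 0.8125 * n_cases

print(round(phi, 2), correctly_classified)
```

Under these assumptions the computed phi rounds to the published .64, and 81.25% of 80 cases is 65 subjects.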

It is evident that the experimental subjects 
did, in fact, exhibit clear evidence of psycho- 
logical disturbance, and it would appear that they 
have emotional needs which probably cannot be 
met by simple advice giving or support. These 
findings must be limited to the population 
studied, but, if future research should corrobo- 
rate that similar conditions exist in other par- 
ishes, serious questions might be raised concern- 
ing the adequacy of ministerial training and the 
competency of the average clergyman to meet 
these demands. 
ART AND SCIENCE IN PROJECTIVE TECHNIQUES 


CHARLES NEURINGER 
University of Kansas 


Many psychologists seem to feel that lacunae exist between the scientific and 
utilitarian status of projective techniques. They may either condemn projective 
techniques for their apparent lack of scientific validity or excessively praise 
them for their ability to generate artistic and intuitive meaning. The gap 
between science and art is not irreconcilable. Long ago, Leonardo da Vinci 
pointed out that sublime art could not occur unless the artist carefully and 
patiently studied the sciences that pertain to art. For Leonardo there was 
no immutable division between art and science, and he further felt that 
good science made for great art. Leonardo’s propositions about art and science 
can also be applied to the field of projective techniques. Careful precision, 
rigor, and specification (science) have led, and can continue to lead, to conclusions 
that are extremely useful in terms of the practical application of projective 
techniques (art). 


Psychologists who either use projective tech- 
niques in clinical practice, or have the impor- 
tant task of teaching graduate students to use 
them, often find themselves in a dilemma. Based 
on their personal experience and clinical skills, 
they often feel that projective techniques are 
extremely useful in the clinic for assessing cur- 
rent functioning, for predicting future behavior, 
and for developing and communicating a global 
and vivid description of personality. However, 
they tend to feel uncomfortable when confronted 
with experimental reports that fail to confirm 
their feelings about the validity of projective 
tests. Reviews of the validity status of projective 
techniques (Ainsworth, 1954; Little, 1959; 
Murstein, 1963; Rabin, 1951; Schneider, 1950; 
Zubin, 1954) may well shake the faith of those 
clinicians who have taken their university train- 
ing seriously. This dilemma has been succinctly 
described by Lindzey (1961) under the general 
heading of “to quantify or not [p. 170].” Be- 
cause psychologists, like most human beings, 
need cognitive clarity, they have tried to re- 
solve the dilemma in a number of ways (Harris, 
1960; Hertz, 1951; Palmer, 1951; Shneidman, 
1962), with varying success. 

One of these attempts (Shneidman, 1962) is 
of particular importance because of the implica- 
tions made for projective-test-minded psycholo- 
gists. In his presidential address to the Society of 
Projective Techniques, Shneidman preferred to 
describe the dilemma as a controversy between 
nomothetic and idiographic methods in projec- 
tive testing. He utilized a very elegant frame of 
reference device for discussing the difficulties 
and problems facing the clinician when con- 
fronted by his antithetical feelings about sta- 
tistical and personal validity of projective tech- 
niques. Shneidman organized his presentation 
around a set of ruminations which were associ- 
ated to the viewing of a medieval triptych. He 
conceived of one wing of the triptych as repre- 
senting the nomothetic view, and of the other 
wing as reflecting the idiographic approach. The 
crucial conclusion (represented by the center 
panel of the triptych) of his resolution of the 
idiographic-nomothetic controversy, is that the 
two views are not mutually exclusive and that 
projective minded psychologists need to take a 
position midway between them. However, Shneid- 
man then describes a midway position that is 
almost, but not quite, devoid of a nomothetic 
contribution. According to Shneidman, the psy- 
chologist who makes the proffered rapproche- 
ment between the idiographic and nomothetic 
methods in projective techniques 


. . . swears to investigate the individual, all the 
individual, but stops short of limiting himself to 
nothing but the individual, for he includes the 
individual in his natural habitat and in his pacific 
and stressful dyadic relationships. Further the per- 
sonologist uses or feels free to select among all the 
available psychological tests that meet his purposes 
of immediate observation and subsequent classifica- 
tion. He feels equally free to invent and construct 
new devices, instruments, techniques and procedures 
as the situation and terrain demand. He is concerned 
as much with creative analysis of conversations 
within the social hour as of consultations within the 
therapy hour. He searches out opportunities to en- 
gage in bold theoretical reflections as well as in 
rigorously sophisticated researches; he tries to direct 
himself away from the trivially clear and the clearly 
trivial [p. 383]. 

This reconciliation between the nomothetic 
and idiographic approaches, while being joyfully 
exuberant and optimistic, assigns a negligible 
role to rigor and precision. The rapprochement 
seems to be a subtle affirmation of the idio- 
graphic wing of the triptych. The snubbing of 
the nomothetic is hardly calculated effectively to 
resolve the clinician’s dilemma. The implication 
is that the clinical psychologist should eschew 
the nomothetic method when dealing with pro- 
jective techniques because it stifles creativity or, 
to use another term, the “artistry” of the idio- 
graphic approach. 

Art, as exemplified by the triptych, is some- 
what related to projective devices in that it can 
arouse, in different individuals, differing rumina- 
tions about the nomothetic (science?) and idio- 
graphic (art?) methods from the same stimulus 
material. If one is not stretching a point too far 
in order to make an analogy between science and 
the nomothetic method on one hand, and art and 
the idiographic approach on the other, then the 
triptych offers some interesting lessons as to the 
relationship between art and science. 

In order for a painter to master his craft, he 
must assiduously train himself in the sciences 
that pertain to his art. Every beginning psychol- 
ogy student is introduced to the cues used by 
painters to create the illusion of depth and dis- 
tance. Leonardo da Vinci (1883) called upon 
artists to study anatomy and physics intensively 
and to make numerous careful observations 
thereof because he considered them to be the 
basic sciences for painters. Leonardo’s Treatise 
on Painting (1956), written in 1472 and repre- 
senting his life’s work on the science of art, 
clearly stated that good art could not exist with- 
out precise scientific study. Leonardo was con- 
sidered to be a master at creating the illusion of 
momentary action. But this artistic achievement 
was not arrived at in any mystical or inspira- 
tional manner. After much experimentation with 
different anatomical and muscular positions, he 
derived a “law” about creating the illusion of 
momentary action. He states that in order to 
achieve the particular effect of momentary action 
on canvas, the artist “must always make the 
figure so that the breast is not turned in the 
same direction as the head” (cf. Siren, 1916, p. 
198). In 1550, Vasari (1912) in his Lives hailed 
Leonardo as the master of shadow and light in 
art. This achievement was come by with much 
time and labor. In the Treatise, Leonardo ad- 
vises painters that the first step in learning how 
to use light and dark tones in order to achieve 
highlighting effects involves the observation, in 
nature, of the relationship between light and 
shadows and to study what is now known as the 
physics of luminous bodies. Depth perspective in 
art, he says elsewhere in the Treatise, can only 
be learned by practiced use of the eye and 
analysis of the true and apparent shape of ob- 
jects. The explication of the aforementioned 
painter’s cues originated with the great Leonardo. 

For da Vinci (1956), there is no doubt that 
science underlies art. That their relationship was 
clearly understood by Leonardo is reflected in 
his discussion of the aims and goals of art. The 
purpose of art, he says, is twofold; it is (a) to 
create an illusion of the third dimension where 
none exists and (b) also at the same time to re- 
veal the feelings of the soul. Psychologists in- 
terested in projective techniques should take 
note, since the two aims also apply to their 
craft; one aim is to “know” personality, and the 
other goal refers to the science of producing the 
“knowing.” For Leonardo, there was no dichot- 
omy between art and science, just as for psy- 
chology there should be no dichotomy between 
the idiographic and nomothetic approaches. Pre- 
cision¹ does lead to relevance, and good nomo- 
thetics is essential to the production of sound 
idiographic statements. Lewin (1935) echoed the 
logical positivists when he pointed out that no 
general law was much good unless it subsumed 
and explained all individual cases. The obvious- 
ness of the conclusion that good nomothetics 
produces good idiography has been masked by 
the presence of bad nomothetics. Faulty science 


¹ There may be some confusion as to the usage 
of the term “precision.” As it is defined here, it 
stands for the careful study and manipulation of 
antecedent conditions that are to be linked to cer- 
tain outcomes. The same general procedure is uti- 
lized whether it is the scrupulous relating of studies 
and experimentations with different media and tech- 
niques to some artistic product that is satisfying to 
the painter, or to the relating of careful study and 
manipulation of projective media and techniques to 
valid conclusions about both personality in general 
and persons in particular. For many psychologists, 
“precision” has come to be equated with the use of 
exact mathematical formulations. The differences 
between the two meanings of the term are only 
superficial, and they are by no means mutually 
exclusive. Whether it is in the study of mathematical 
models of learning or the effect of early parental 
deprivation on TAT stories, the same paradigm of 
careful relating and manipulation of antecedent 
conditions to responses should be attempted. The 
language may be different, but the aims and modus 
operandi are the same. The aims and attitudes of 
the psychologist and artist are also similar. The 
artist carefully studies his craft so as to achieve an 
outcome that is relevant and meaningful. The scien- 
tist controls and manipulates his tools and theories 
so that the outcomes are also relevant and mean- 
ingful. 
cannot produce good data which can then be 
correctly applied to individual persons. It is not 
surprising then that some clinical psychologists 
have become skeptical about experimental at- 
tempts to validate projective techniques. 

Perhaps some of the skepticism concerning the 
relationship between science and art in projec- 
tive techniques will be dissipated by the presen- 
tation of some instances where precision, rigor, 
and specification have proved to be relevant for 
the clinician in the clinic. 

The experimental attempts to specify the rela- 
tionship between behavioral expressions of ag- 
gression and fantasied aggressive responses ap- 
pearing on the Thematic Apperception Test 
(TAT) have been detailed by Lindzey (1961). 
Beginning with Murray’s (1943) introduction of 
the TAT, questions were raised as to the feasi- 
bility of concluding that individuals had com- 
mitted aggressive acts in the past, and/or pre- 
dicting such behavior in the future, if they pro- 
duced aggressive content on the test. Murray 
cautioned against assuming that fantasied ag- 
gression was isometrically related to aggressive 
acting-out behavior. Specification of the relation- 
ship is of great importance to the projective 
psychologist because he is often called upon to 
make judgments about future aggressive behav- 
ior. A succession of studies (Kagan, 1956; Les- 
ser, 1957; Mussen & Naylor, 1954; Sanford, Ad- 
kins, Miller, & Cobb, 1943), each adding greater 
specification, have done much to clarify the 
relationship between fantasied and behavioral 
aggression. Without traversing all the steps, it 
was found that fantasied and overt aggression 
were positively related on TAT cards that can 
evoke hostile fantasies in those hostile individuals 
whose parents were relatively accepting of ag- 
gression and negatively related in hostile persons 
whose parents were unable to accept aggression 
in their children. Lindzey, in his review of these 
studies, concluded that one of the advantages of 
such an approach with the TAT was for “. . . 
the individual interested in the practical applica- 
tion of the instrument [p. 139].” 

Greater precision in specifying the conditions 
producing differing manifestations of anxiety on 
the Rorschach test (Neuringer, 1962) leads to 
increased relevance for the clinician. A casual 
perusal of the theoretical and empirical Ror- 
schach anxiety indicator literature could easily 
lead a reader into concluding that the Rorschach 
was an unreliable instrument for detecting anxi- 
ety, since beside the theoretical formulations 
varying a great deal, the results of empirical 
studies were inconsistent and contradictory. 
However, a careful analysis of the conditions 
under which anxiety was generated and measured, 
the kinds of subjects used, and the manner in 
which the Rorschach was administered, revealed 
consistent patterns of anxiety manifestations. 
Neuringer reported that when undergraduates 
were used in either laboratory-induced anxiety 
situations, or where anxiety was measured by an 
inventory method, subjects became constricted 
because of greater vigilance. Individuals suffering 
from long-term anxiety associated with personal 
difficulties in adjustment tended also to con- 
strict, but became desensitized to their surround- 
ings and avoided close contact with large seg- 
ments of their environment. Neuringer was able 
to specify reliable differential Rorschach de- 
terminant patterns for both kinds of anxiety 
manifestations. 

In da Vinci’s (1954) Notebooks, sketches of 
the body stand side by side with geometric cal- 
culations of their anatomical dimensions. In 
projective psychology, art and science should 
also stand side by side. There is still much 
projective technique science to be studied. The 
psychophysics of projective technique stimuli is 
still unknown. Perceptual facts and theories have 
as yet not been meaningfully applied to projec- 
tive techniques. There has been no systematic 
correlation of response tendencies with either ex- 
perimentally varied or naturally varying moti- 
vational, emotional, or intellectual states of per- 
ceivers. These kinds of nomothetic studies need 
to be done if there is to be a renaissance of 
projective technique idiography. Better projec- 
tive technique science should produce better 
projective technique artistry, just as science 
underlay art in the mind of Leonardo da Vinci. 
MEASURES OF HOMOSEXUALITY: 


CROSS-VALIDATION OF TWO MMPI SCALES AND IMPLICATIONS FOR USAGE¹ 


RICHARD R. FRIBERG 


University of Minnesota 


A cross-validation study is reported which supports the use of the MMPI 
Mf scale for evaluating homosexuality in groups within a psychiatric popula- 
tion and the usefulness of MMPI scale HSX as a measure of general sexual 
deviancy in such populations. However, the use of the scales as an aid in 
individual description and classification obtains no justification from the 
results of the study. In addition, high educational level was not found to 
be a significant factor in raising Mf and HSX scores within a homosexual 
group of patients, nor were educational level or intelligence found to lead 
to elevated scores on these scales among hospitalized male schizophrenics. 


¹ This paper is based on portions of a doctoral 
dissertation at the University of Minnesota. The 
author wishes to thank Paul Meehl, William Scho- 
field, and Richard Holroyd for their encouragement 
and suggestions. Thanks are also due to Marijana 
Weiner and Bernard Weiner for assisting in the 
preparation of this paper. 


The need for developing a scale with better 
discriminating capacity than the MMPI Mf 
scale in classifying male homosexuals and non- 
homosexuals was noted by Panton (1960). In 
response to this need he developed and intro- 
duced a new scale, the HSX scale. Data pre- 
sented by Goodstein (1954), Winfield (1953), 
and Levy, Southcombe, Cranor, and Freeman 
(1953) have demonstrated that an increase in 
male intelligence or educational level is re- 
flected in increasing Mf scores. Therefore, in de- 
riving the HSX scale Panton required items to 
discriminate not only a homosexual group from 
a general nonhomosexual group, but also the 
homosexuals and superior IQ nonhomosexuals. 
Panton used prison inmates to derive his scale. 
The present study was undertaken to assess the 
relative merits of scales Mf and HSX in settings 
more typical of general MMPI usage. It was of 
special interest to discover whether the some- 
what more sophisticated derivation of the HSX 
scale, that is, controlling for intelligence, had in 
fact resulted in a more powerful scale for gen- 
eral application. The T scores of four groups of 
subjects on scales Mf and HSX were compared. 
It was felt that adequately validated scales 
should discriminate male homosexuals not only 
from a normal population, but also from a 
group of subjects with sexual deviations other 
than homosexuality. Otherwise, the scales might 
well be tapping variables indicative of abnormal 
sexuality in general rather than homosexuality 
specifically. Similar reasoning led to the inclusion 
of a general abnormal group. It is conceivable 
that the scales could discriminate abnormality 
in general rather than homosexuality. A group of 
normal subjects was included to cross-validate 
previous studies demonstrating that the Mf and 
HSX scales differentiate homosexuals from es- 
sentially normal subjects. 


METHOD 


The groups and the selection procedures 
used are described below. 

1. Homosexuals (H), N=19. The files of the 
inpatient psychiatric service of the University of 
Minnesota Hospitals were searched for all patients 
with a discharge diagnosis of sexual deviation 
(homosexuality), pathological sexuality (homosexu- 
ality), or similar diagnosis, or where homosexuality 
was stated to be the primary reason for hospitaliza- 
tion. Cases in which homosexuality was secondary 
to neurotic or psychotic conditions were eliminated. 
Six cases were eliminated from a preliminary analy- 
sis because they had been used in the original deriva- 
tion of scale Mf. These cases were retained for cross- 
validating scale HSX. 

2. Sexual Deviants (SD), N=16. This group 
was selected in a similar fashion but did not in- 
clude patients with known homosexual histories. 
Histories of sexual deviancy such as pedophilia, 
bestiality, rape, and incest characterized this group. 
The MMPI records of two patients were not com- 
plete but were complete for items on scale HSX; 
these cases were used for the cross-validation of the 
HSX scale only. 

3. General Abnormals (GA), N = 67. This group 
was formed by using every third patient in the 
group compiled by Rosen (1952) for use in deriv- 
ing MMPI scales in a psychiatric population. Ros- 
en’s original patient group consisted of 55% neu- 
rotics, 33% psychotics, and 12% behavior disorders. 

4. Normals (N), N=50. For this sample, the 
first 50 male cases were drawn from the normative 
sample used by Hathaway and Briggs (1957). 


RESULTS 


Table 1 presents a comparison of means and 
standard deviations on the Mf scale for the four 
groups. The data show marked differences in the 
expected direction between Group H and each of 
the other three samples. It is also evident that 
pathology of a more general nature can elevate 
Mf scores, as indicated by the significantly higher 
mean T scores for Groups SD and GA compared 
with Group N. 
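
The paper does not state how its critical ratios were computed. As an illustrative check only, the sketch below assumes the common large-sample form, in which the difference between two means is divided by an unpooled standard error, and applies it to the Mf means, standard deviations, and group sizes reported in Table 1 for Groups H and GA. The function name is hypothetical.

```python
import math


def critical_ratio(mean_a: float, sd_a: float, n_a: int,
                   mean_b: float, sd_b: float, n_b: int) -> float:
    """Difference between two means divided by its unpooled standard error."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Mf T scores: Group H (M = 70.47, SD = 15.06, N = 19)
# versus Group GA (M = 57.12, SD = 11.72, N = 67).
h_vs_ga = critical_ratio(70.47, 15.06, 19, 57.12, 11.72, 67)
print(round(h_vs_ga, 2))
```

Under this assumption the H versus GA comparison comes out at about 3.57, close to the tabled value of 3.56. The small discrepancy may reflect rounding or a slightly different variance estimate in the original computation.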

Table 2 presents a comparison of means and 
standard deviations on the HSX scale for the 
four groups. While the HSX scale differentiates 
the homosexual group from Groups GA and N, 
the insignificant difference between Groups H 
and SD suggests that the scale is probably tap- 
ping variables indicative of deviant sexual be- 
havior in general rather than homosexuality in 
particular. 


TABLE 1 

COMPARISON OF MEANS AND STANDARD DEVIATIONS FOR T-SCORE VALUES ON THE Mf SCALE FOR 
HOMOSEXUAL (H), SEXUAL DEVIANT (SD), GENERAL ABNORMAL (GA), AND NORMAL (N) GROUPS 

             H         SD        GA        N 
          N = 19    N = 16    N = 67    N = 50 
M          70.47     57.88     57.12     48.34 
SD         15.06      7.07     11.72      8.86 

Comparison    M diff.    Ratio 
H-SD           12.59     2.95 (t ratio) 
H-GA           13.35     3.56* (critical ratio) 
H-N              …       6.00* (critical ratio) 
SD-GA            …        …    (critical ratio) 
SD-N             …        …    (critical ratio) 
GA-N             …        …    (critical ratio) 

*p ≤ .01. 


TABLE 2 

COMPARISON OF MEANS AND STANDARD DEVIATIONS FOR T-SCORE VALUES ON THE HSX SCALE FOR 
HOMOSEXUAL (H), SEXUAL DEVIANT (SD), GENERAL ABNORMAL (GA), AND NORMAL (N) GROUPS 

             H         SD        GA        N 
          N = 25    N = 18    N = 67    N = 50 
M          65.77     61.50     51.91     58.72 
SD         10.19      9.49      9.43      7.22 

Comparison    M diff.    Ratio 
H-SD            4.27     1.31 (t ratio) 
H-GA           13.86     5.92 (critical ratio) 
H-N             7.05     3.11* (critical ratio) 
SD-GA           9.69     3.8…* (critical ratio) 
SD-N            2.78     1.13 (critical ratio) 
GA-N            6.81     4.45* (critical ratio) 

*p ≤ .01. 

Another indication that the Mf and HSX 
scales are not measuring similar variables, de- 
spite similarities in construction, is evidenced by 
correlational data. Product-moment correlations 
were computed for the two scales in the samples 
studied. All of the correlations were negative 
and none were significant beyond the .05 level 
of confidence. 


USE OF CUTTING SCORES FOR INDIVIDUAL 
CLASSIFICATION 


Data were examined to assess the usefulness 
of cutting scores for individual classification. 
When scale Mf was used with a T score of 65 
as the cutting score, 59% of Group H members 
were correctly classified while 19% of the SD, 
25% of the GA, and 2% of the N subjects were 
misclassified. If a cutting score of 70 is used, 
the percentage of false positives in Groups SD, 
GA, and N is reduced but the percentage of 
true positives is only 42%. If scale HSX is used 
to classify sexual deviants the discrimination 
also is quite poor. Using a raw score of 13 as 
the cutting point, the false positive rates for 
Groups GA and N are 10% and 18% with a true 
positive rate of only 44%. Use of lower cutting 
scores led to highly inflated false positive rates 
with meagre gains in the identification of true 
positives. 

Various combinations of cutting scores on the 
two scales were experimented with to see if im- 
proved discrimination of Group H from the other 
groups could be accomplished. The use of a T 
score of 60 on scale Mf and a raw score of 11 
on the HSX scale led to the best discrimination 
found using either scale separately or both in 
multiple cutting scores. This combination yielded 
a true positive rate of 64% with only 7% false 
positives for Group GA and 6% false positives 
for Group N. However, the false positive rate 
for Group SD was 17%, which is considered too 
high to support the use of these scales to diag- 
nose or classify individuals. If Bayes’ rule for 
calculating inverse probability, as discussed in 
Meehl (1956), is applied to the best set of 
cutting scores described above, using 10% as the 
hypothetical percentage of homosexuals in a 
psychiatric population (this stacks the odds in 
favor of the diagnostic tool), it is readily ap- 
parent that a positive assertion made on the 
basis of a positive test would be more likely to 
be incorrect than correct. 
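
The inverse-probability argument can be made concrete with a short sketch. It uses the 10% base rate and the 64% true positive rate given in the text. The paper does not report a single false positive rate for the whole non-homosexual psychiatric population, so the 17% figure (the rate reported for Group SD) is applied here purely as an illustrative assumption, and the function names are hypothetical.

```python
def positive_predictive_value(base_rate: float,
                              true_positive_rate: float,
                              false_positive_rate: float) -> float:
    """P(condition | positive test) by Bayes' rule."""
    true_positives = base_rate * true_positive_rate
    false_positives = (1.0 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


def break_even_false_positive_rate(base_rate: float,
                                   true_positive_rate: float) -> float:
    """False positive rate at which a positive test is right half the time."""
    return base_rate * true_positive_rate / (1.0 - base_rate)


# Assumed inputs: 10% base rate, 64% true positives, 17% false positives.
ppv_at_sd_rate = positive_predictive_value(0.10, 0.64, 0.17)
break_even = break_even_false_positive_rate(0.10, 0.64)
print(round(ppv_at_sd_rate, 3), round(break_even, 3))
```

With these inputs fewer than one in three positive assertions would be correct. More generally, at a 10% base rate and a 64% true positive rate, any overall false positive rate above roughly 7.1% makes a positive assertion more likely wrong than right. Group GA alone (7%) sits almost exactly at that threshold, so any admixture of cases resembling Group SD pushes the overall rate past it.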


EFFECTS OF INTELLIGENCE AND EDUCATIONAL 
LEVEL ON Mf AND HSX SCORES 


In view of Panton’s finding that high intellec- 
tual level led to increased Mf scores in his popu- 
lation, an attempt was made to assess the effects 
of this variable on the subjects’ scores in the 
present study. Although intelligence data were 
not available on enough of the subjects to per- 
mit an analysis, the educational level of the indi- 


vidual subjects in Group H were known. The , 


mean Mf T scores for Subgroups of the H group 
formed by dichotomizing on the basis of educa- 
tional level were 63.63 for subjects with educa- 
tions greater than high school and 80.88 for 
subjects with educations of high school or less. 
This difference was significant by ¢ test (p< 
.02). Contrary to expectation, there is a tend- 


_ e 
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ency for less educated subjects in this sample to 
obtain higher scores on scale Mf. An identical 
analysis of the data for the HSX scale revealed 
no significant difference in mean T score. Al- 
though the groups used as controls in the study 
(GA and N) were known to be comparable to 
the general population in educational level, the 
data on this variable were not available for the 
individual subjects at the time of the present 
study. However, since data were available for
these variables for a large group of state hos-
pital male schizophrenics, it was thought desir-
able to subject the Mf and HSX scales to analy-
sis with respect to the influence of these varia-
bles in this group.2 A comparison of means and
standard deviations for the scales on groups
formed by dichotomizing on the basis of intelli-
gence and education revealed no significant dif-
ferences.


DISCUSSION


The results presented in Table 1 support the 
usage of the Mf scale for the study of homo- 
sexuality within a psychiatric population. Use of 
the scale as a criterion for group membership 
in conjunction with other criteria for selecting 
subjects for research groups would also seem 
appropriate. However, the cutting score data 
demonstrate that the scale is inapplicable for 
individual classification. Its usage for identifica-
tion of this low base-rate condition would lead
to more incorrect than correct assertions if posi-
tive assertions were made on the basis of Mf
score. Of particular interest is the finding that
not only sexual deviancy results in elevated Mf
scores but that general psychiatric abnormality
has a similar effect. This finding suggests that 
there may be common variables, for example, 
interest patterns, which differentiate the three 
patient groups from the normal population. 
The results presented in Table 2 demonstrate
the inapplicability of the HSX scale for assess-
ing homosexuality in psychiatric groups but lend
some support to the use of the scale as a measure 
of general sexual deviancy. Although Panton’s 
data suggest that the scale is useful when identi- 
fying homosexuals in a prison population, the 
high scores obtained by the sexual deviants in 
the present study indicate further research on
the scale should be carried out to assess the false
positive rate for sexual deviants in a prison
population. The elevated scores obtained by the
normal group also caution against the use of the
scale until more information is available. As
with the Mf scale, the cutting score data demon-
strate the inapplicability of this scale for indi-
vidual classification.


2 Subjects for this analysis are described in detail
in a thesis entitled “A Study of Homosexuality and
Related Characteristics in Paranoid Schizophrenia”
by Richard R. Friberg available from University
Microfilms, Ann Arbor, Michigan.

Data presented lend no support to the hy-
pothesis that IQ or educational level is related
to Mf or HSX T scores in a population of hos- 
pitalized male schizophrenics. High IQ or educa- 
tional level alone does not appear to lead to 
elevations on these scales. The relationships 
previously found in college populations and in a 
prison population might well be accounted for 
by some other characteristic, for example, in- 
terest patterns, which these high-intelligence sub- 
jects have in common rather than by intelligence 
or education per se. There is also the possibility 
that the effects of education and intelligence on 
the two scales vary from population to popula-
tion or interact with the effects of other vari-
ables.

The data presented caution against the usage 
of empirically derived scales on populations dis- 
similar to the one on which they were derived 
without study of their applicability to the new 
population.


It was hypothesized that perception of M on the Rorschach Test is related
to future time perspective as measured by a time-span technique. High M
persons were found to have a longer future time perspective while low M was
correlated with shorter future time perspective. The results were seen as
supporting the concept that both M and time perspective are related to the
ability to delay immediate gratification of needs.


A number of studies on time orientation have
focused on its relationship to the concept of
impulse control or the capacity to delay im-
mediate gratification of needs. Barndt and
Johnson (1955) and Davids, Kidder, and Reich
(1962) found that delinquent boys have a shorter
future time perspective than nondelinquents.
Siegman (1961) further found that among delin-
quents there was a positive correlation between
the subjects’ future time perspectives and their
scores on a task of motor impulse control.
Mischel and Metzner (1962) found that subjects
preferring immediate reward tend to have more
variable future time perspectives. Ricks, Um-
barger, and Mack (1964) studied adolescent de-
linquent boys during the course of treatment and
found that prospective time span increased with
successful treatment.

Interest has also been focused on the relation-
ship between the concept of delayed gratification
and the perception of moving human beings (M)
on the Rorschach (Singer, 1955). Singer and
Herman (1954) have shown that subjects who
produce relatively large numbers of M manifest
longer motor delaying capacity than low M sub-
jects. On the basis of the above, it would be
expected that time orientation would be related
to perception of M on the Rorschach. Kurz
(1963), following the above line of reasoning,
studied the relationship between the perception
of M and preferences for slow or rapid images
of time as measured by the Time Metaphor
Test. The results were considered supportive of
the notion that M, as a measure of capacity
for delayed need satisfaction, is inversely related
to the level of motivation for the rapid passage
of time. Kurz, Cohen, and Starzynski (1965)
have further reported that some Rorschach scores
(M and C) are related to the manner in which
the passage of time is perceived when the sub-
jects are asked to estimate time intervals.

The present study attempted to investigate
the relationship between the delayed gratification
aspect of M and time perspective of children.
Using a time-span technique, it was hypothesized
that those subjects who produce relatively few
M would manifest a shorter future time per-
spective while those subjects with relatively large
numbers of M would produce a longer future
time perspective.


METHOD 
Subjects 


The subjects, 44 children, were of average intel-
ligence (90 to 110 IQ) as measured by the Pintner-
Cunningham Intelligence Scale. They ranged in age
from 8 years, 6 months to 11 years, 1 month and
were in the third and fourth grades of the New
York City Public School System. The children, 28
boys and 16 girls, were white, English speaking,
and born in New York City. They were all of the
same ethnic background and all lived and went to
school in the same stable, middle-class neighborhood.


Methods of Measurement

Time span. Time span was measured by a projec-
tive technique (cf. LeShan, 1952; Teahan, 1958;
Wallace, 1956) designed to reflect time perspec-
tive. Two stories were obtained in response to the
instructions:

Tell me a story. Just make up a story and tell
it to me.


On completion of the story and the inquiry (see
below) instructions continued:

Now I want you to tell me another story. This
time I’ll start one for you, and then you finish
it any way you wish. I’ll start it now. “At three
o’clock one afternoon, two boys were walking near
the outside of town. . . .” Now you finish the
story.


Where a specific time interval was mentioned by 
the subject in telling the story, no inquiry was made. 
If no time interval was included, an inquiry was 
made at the end of the story to ascertain the time 
span of the action described.

The stories were assigned scores depending on the
length of the time covered by the action of the 
story. The scoring system involved a categorization 


of the frequency distribution of time intervals. The 
scoring system is similar to one described by Barndt 
and Johnson (1955) and Davids and Parenti 
(1958). The scoring categories were as follows: 
(a) under 1 hour, (b) 1 hour to under 2 hours,


12 hours, (e) 12 hours to under 1 week, (f) 1 week
or more. Each story thus was scored 1 to 6. The
correlation between the scores of the two stories was
found to be significant (product moment correlation
of .42, p < .01). The scores for the two stories
were averaged for each subject, and the average
score was used in all calculations.


The Rorschach
was administered individually and followed by an
inquiry designed to clarify the manner in which
the subject perceived his response. The scoring of
M was done according to Klopfer and Kelley
(1946). The number of M responses perceived by
each subject was the score used in ranking each
subject for statistical analysis.
Treatment of the data. The time span and Ror-
schach protocols were coded and scored independ-
ently. The statistic used was the Kendall rank cor-
relation coefficient (tau), which is described in
Kendall (1955) and Siegel (1956, p. 218).
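
The rank correlation named above can be sketched in a few lines. The version below is the tie-corrected form usually called tau-b; whether the study applied exactly this correction is an assumption, and the scores shown are invented for illustration, not data from the study.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation with the usual correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: enters neither count
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    denom = sqrt((pairs + tied_x_only) * (pairs + tied_y_only))
    return (concordant - discordant) / denom


# Hypothetical scores: averaged time-span score and number of M responses.
time_span = [1.5, 2.0, 3.5, 4.0, 5.5]
m_count = [0, 1, 1, 3, 2]
print(round(kendall_tau_b(time_span, m_count), 2))
```

A rank statistic suits these data because both the six-category time-span score and the M count are ordinal and contain many ties.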


The data support the hypothesis that the high
M individual has a longer future time perspective
while a low M individual has a shorter future
time perspective. A significant positive correla-
tion was found between the time-span measure
and M (tau = .28, p = .005). These results sub-
stantiate past findings (Kurz, 1963; Kurz,
Cohen, & Starzynski, 1965) and lend support
to the notion of the high M person being able
to delay immediate gratification of needs. It
would appear that in not being as concerned
about immediate gratification as the low M indi-
vidual, he is able to focus more into the future.
The low M person, on the other hand, would
appear to be relatively more concerned with the
present as immediate gratification is of greater
importance to him.


SIMULATION OF NORMAL AND PSYCHOPATHIC MMPI
PERSONALITY PATTERNS


RICHARD I. LANYON


Rutgers University 


Previous work with the simulation of normalcy on personality tests has 
suggested that good adjustment involves an adequate understanding of socially 
approved behavior. 37 well-adjusted and 42 maladjusted college males took
the MMPI under instructions to simulate very good adjustment, and again 
under instructions to simulate psychopathic personality. Both groups simulated 
very good adjustment satisfactorily; however, well-adjusted Ss were superior 
to maladjusted Ss in the simulation of psychopathic personality. The findings 
were consistent with the literature on role-taking and empathy, supporting 
the view that good adjustment involves an ability to understand and predict 
socially adequate and inadequate behavior. 


There is continuing controversy over the sig- 
nificance of the social desirability variable in 
structured personality inventories. Some psy- 
chologists (e.g., Heilbrun, 1964) have suggested 
that social desirability should be considered a 
valid index of personality adjustment, rather 
than a source of testing error or noninterpretable 
variance. This interpretation asserts that high 
scores on a social desirability measure under 
regular test-taking conditions are indicative of 
good psychological adjustment. 

There is recent evidence that adjustment may
also be related to the ability to answer in a 
socially desirable direction on personality inven- 
tories when specifically asked to do so. Grayson 
and Olinger (1957) asked psychiatric patients, 
who had taken the Minnesota Multiphasic Per- 
sonality Inventory (MMPI) routinely, to take 
it again, answering “. . . the way a typical,
well-adjusted person on the outside, would do.”
Their findings suggested a relationship between 
favorability of prognosis and degree of improve- 
ment from the regular MMPI to the simulated 
one. However, the degree of normalcy indicated 
by the simulated MMPI was not in itself indica- 
tive of favorable prognosis. Canter (1963), com- 
paring the responses of well-functioning psychi- 
atric aides and two groups of alcoholics on the 
California Psychological Inventory (CPI), found 
a positive relationship between ability to present 
a good picture under “fake good” instructions, 
and a subject’s (S's) relative adjustment. Can- 
ter’s findings were interpreted to support Gough’s 
(1960) theory of socialization. According to
Gough, the socially inadequate person lacks the
role-playing skills which enable him to identify
with another’s point of view.


1 This research was supported in part by a grant
from the Rutgers Research Council. Appreciation
is expressed to Carol Vogel and Alice Merrill for
their assistance in analyzing the data.

A somewhat similar framework for these find-
ings is suggested by the specific literature on
empathy and role-taking. Milgram (1960) has
defined role-taking as “an implicit, empathic
process whereby a person predicts the behavior
in a given situation of another person or per-
sons.” Studies in this area, summarized by Mil-
gram, have consistently shown a positive rela-
tionship between empathic ability and psycho-
logical adjustment. Thus, from the viewpoint of
role-taking theory, the well-adjusted person
should be better able than the maladjusted per-
son to simulate all kinds of behavior, adequate
or inadequate. There should be a positive rela-
tionship between adjustment level and ability to
simulate the personality test responses of a per-
son possessing a specific kind of psychopathology,
when information about the pathology is pro-
vided. Sarbin, Taft, and Bailey (1960, p. 41)
have referred to such a situation as a test of
“imitative empathy.”

In the present study, Ss were instructed to
simulate “very good adjustment” (VGA) on the
MMPI. They were also instructed to simulate a
specific kind of pathology, psychopathic person-
ality. It was expected that well-adjusted Ss would
be more successful than maladjusted Ss in both
simulations.


METHOD
Subjects


The MMPI was initially administered to 482 male
undergraduates in an introductory psychology course
at a state university. Invalid profiles (L > 6, F >
16, or K > 22) were discarded. Well-adjusted Ss
were defined as those for whom T scores on MMPI
scales Hs, D, Hy, Pd, Pa, Pt, and Sc were below 65,
and T scores on scales Mf and Ma were below 70.
The higher scores allowed on scales Mf and Ma
were in accordance with Goodstein’s (1954) report
of MMPI norms for college students. Maladjusted
Ss were defined as those for whom T scores on at
least three of the above nine scales (including at
least two other scales beside Mf and Ma) were 70
or above. The definitions resulted in 92 Ss (19%)
being called well adjusted, and 84 (17%) being
called maladjusted. Of these, 37 well-adjusted Ss
(Group W) and 42 maladjusted Ss (Group M) vol-
unteered for the present study. The volunteers were
students who were free from classes at the particular
times the study was scheduled, and differed in no
other apparent way from the students who did not
volunteer.
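
The selection rules just described can be written out as a short sketch. The function and variable names are hypothetical, and the two profiles used to exercise the rules are the Group W and Group M mean T scores from Table 1, treated here as if they were individual profiles purely for illustration.

```python
SCALES_CUT_65 = ("Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc")
SCALES_CUT_70 = ("Mf", "Ma")


def is_valid(raw):
    """Profiles with L > 6, F > 16, or K > 22 (raw scores) were discarded."""
    return raw["L"] <= 6 and raw["F"] <= 16 and raw["K"] <= 22


def classify(t):
    """Apply the well-adjusted and maladjusted definitions to T scores."""
    if (all(t[s] < 65 for s in SCALES_CUT_65)
            and all(t[s] < 70 for s in SCALES_CUT_70)):
        return "well-adjusted"
    elevated = [s for s in SCALES_CUT_65 + SCALES_CUT_70 if t[s] >= 70]
    elevated_other = [s for s in elevated if s in SCALES_CUT_65]
    # At least three elevations, at least two of them besides Mf and Ma.
    if len(elevated) >= 3 and len(elevated_other) >= 2:
        return "maladjusted"
    return "unclassified"


group_w_means = {"Hs": 49.87, "D": 51.36, "Hy": 54.10, "Pd": 53.92, "Mf": 58.04,
                 "Pa": 47.65, "Pt": 53.40, "Sc": 53.94, "Ma": 56.63}
group_m_means = {"Hs": 59.44, "D": 71.24, "Hy": 64.67, "Pd": 67.59, "Mf": 73.32,
                 "Pa": 62.41, "Pt": 76.08, "Sc": 74.92, "Ma": 63.47}
print(classify(group_w_means), classify(group_m_means))
```

The two definitions are not exhaustive, which is consistent with only 19% and 17% of the 482 students falling into the two groups.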


Procedure 


The study was carried out about 2 months after 
the initial administration of the MMPI. Subjects 
were not told why they, in particular, had been se- 
lected to participate. They were requested to take 
the MMPI again, imagining that they were newly 
graduating from college, were being assessed for a 
highly desirable job, and for this reason were trying 
to appear very well adjusted (VGA condition). Fol- 
lowing this administration and a 15- to 30-minute 
rest period, Ss were handed a mimeographed sheet 
containing a brief description of a psychopathic 
personality. The description was paraphrased from 
Hathaway and McKinley’s (19: description of
high scorers on the Pd scale of the MMPI. The Ss
were requested to take the MMPI once again, trying
to answer the way they considered a psychopath
would answer (PP condition). The description was
initially read aloud to Ss, who were encouraged to
keep referring to their mimeographed sheets as they
completed the inventory.


RESULTS 
Simulation of Very Good Adjustment 


Groups W and M were compared in four dif- 
ferent ways on their ability to simulate VGA. 

1. Mean MMPI profiles were compared. In 
Table 1 are presented the mean MMPI scores 
for Group W and Group M under normal condi- 
tions of administration. Also presented are the 
mean scores for their simulated VGA profiles. In 
all cases, T scores were used for the clinical 
scales, and raw scores for the validity scales. 
Comparison of MMPIs under normal conditions 
showed that the scores of Group M were higher 
than Group W on every scale, the differences 
being significant beyond the .001 level for all 
but three scales. However, it can be seen that 
Group M differed little from Group W in ability 
to simulate VGA. Significant differences were 
shown only on two scales. Further, the VGA 
simulation profiles of both Group M and Group 
W differed little, except for the validity scales, 
from the mean profile of Group W under normal 
conditions. 

2. The simulated VGA profiles of Groups W 
and M were individually examined for their 
ability to satisfy a specific criterion of VGA. 
The criterion chosen was the one used in the 
original selection of well-adjusted Ss—T scores 


TABLE 1

MEAN MMPI PROFILES FOR WELL-ADJUSTED (GROUP W: N = 37) AND MALADJUSTED (GROUP M:
N = 42) STUDENTS UNDER NORMAL CONDITIONS AND WHEN SIMULATING
VERY GOOD ADJUSTMENT

         MMPI under normal conditions      MMPI simulation of very good adjustment
Scale    Group W    Group M    t           Group W    Group M    t

La       [illeg.]   [illeg.]   1.23        6.73       6.90       .25
Fa       3.78       [illeg.]   [illeg.]    1.89       2.38       1.32
Ka       15.41      13.56      2.34*       20.38      19.81      .80
Hs       49.87      59.44      5.68***     50.56      51.00      .30
D        51.36      71.24      9.33***     48.28      50.00      1.14
Hy       54.10      64.67      6.74***     57.28      57.33      .05
Pd       53.92      67.59      6.99***     53.73      35.23      [illeg.]
Mf       58.04      73.32      8.99***     55.82      61.95      3.38
Pa       47.65      62.41      8.61***     52.03      51.60      .28
Pt       53.40      76.08      14.53***    50.06      54.39      3.13
Sc       53.94      74.92      14.00***    52.54      53.41      .64
Ma       56.63      63.47      3.35**      55.09      56.88      1.09

a Raw scores.
* Significant beyond the .05 level.
** Significant beyond the .01 level.
*** Significant beyond the .001 level.


on all clinical scales below 65 except for the
Mf and Ma scales, for which the limit was raised 
to 70. All profiles were considered in this com- 
parison regardless of whether they met the usual 
validity requirements. The criterion of VGA was 
successfully met by 31/37 Ss from Group W, 
and by 28/42 from Group M. These proportions 
are not significantly different (χ² = 3.04, p <
.10). It should be noted that the use of the same
rules for the criterion of good adjustment and of 
successful simulation of VGA spuriously in- 
creases the probability of obtaining a significant 
difference. 

3. Groups W and M were compared on their 
ability to produce a “valid” profile under the 
simulated VGA condition, without regard for 
the accuracy of simulation. The validity require-
ments used were those employed previously (L
< 6, F < 16, and K < 22). Here, 13/37 Ss in
Group W and 12/42 Ss in Group M produced
valid profiles. These proportions do not differ
significantly (χ² = .39, p < .70).

4. Groups W and M were compared on their 
ability to produce simulated VGA profiles which 
were valid and which also satisfied the criterion 
for successful simulation, This was done by ex- 
amining the valid profiles for successful simula- 
tion. There were 11/13 in Group W and 10/17 
in Group M which met the criterion. Applica- 
tion of Fisher’s exact probability test shows 
these proportions not to differ (p <.15). 
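
The comparisons of proportions reported in this section can be checked with a brief sketch. It assumes an uncorrected chi-square for a 2 x 2 table and a one-tailed Fisher test; the source does not say whether a continuity correction or a two-tailed test was used, and the function names are hypothetical.

```python
from math import comb


def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def fisher_one_tailed(a, b, c, d):
    """Probability of a count of a or more in the first cell, margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Successes and failures for Group W, then Group M, as given in the text.
print(round(chi_square_2x2(31, 6, 28, 14), 2))   # criterion met: 31/37 vs 28/42
print(round(chi_square_2x2(13, 24, 12, 30), 2))  # valid profiles: 13/37 vs 12/42
print(round(fisher_one_tailed(11, 2, 10, 7), 2)) # valid and met: 11/13 vs 10/17
```

Under these assumptions the statistics agree closely with the values reported above, which suggests that no continuity correction was applied.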


Simulation of Psychopathic Personality 


Because some Ss produced profiles of simu- 
lated PP which contained unusually high scores, 
it was not considered meaningful to compare the 
mean profiles of Groups W and M in the PP 
condition. Several other comparisons were made.

1. Groups W and M were compared on their 
ability to produce a “valid” simulated PP pro- 
file. In choosing the appropriate validity require- 
ments, Hathaway and Monachesi’s (1963, p. 36) 
work with delinquency and the MMPI was fol- 
lowed. The limits L<6 and K<22 were re- 
tained, while the limit for validity on the F scale 
was raised to <21. In Group W 16/37 profiles 
were valid; in Group M, 12/42. These propor-
tions do not differ significantly (χ² = 1.85, p <
.20).

2. Four different criteria of successful simula- 
tion of PP were chosen as representing the most 
common MMPI signs associated with PP. In 
order of stringency, they are: Pd the highest 
scale; Pd and Ma the highest scales, in either 
order; Pd one of the two highest scales; and Pd, 
Sc, and Ma the highest scales, in any order. Ties 
were permitted in all criteria. Using Fisher’s ex-
act probability test in comparisons of valid
profiles, simulation of PP was found to be more
successful in Group W than in Group M for
every comparison. Probabilities, in the same
order as the criteria, were .070, .001, .013, and
.008.
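
The four criteria of successful PP simulation can be expressed as simple checks on a profile of T scores. The sketch below is one reading of the criteria: the handling of ties and the set of scales entering the comparison are assumptions, and the profile shown is invented.

```python
def are_highest(profile, peak_scales):
    """True if every scale in peak_scales is at least as high as all others."""
    floor = min(profile[s] for s in peak_scales)
    return all(profile[s] <= floor for s in profile if s not in peak_scales)


def pd_in_top_two(profile):
    """True if at most one scale is strictly higher than Pd."""
    return sum(1 for s in profile if profile[s] > profile["Pd"]) <= 1


def pp_criteria(profile):
    return {
        "Pd highest": are_highest(profile, {"Pd"}),
        "Pd and Ma highest": are_highest(profile, {"Pd", "Ma"}),
        "Pd one of two highest": pd_in_top_two(profile),
        "Pd, Sc, and Ma highest": are_highest(profile, {"Pd", "Sc", "Ma"}),
    }


# Hypothetical simulated profile (T scores on nine clinical scales).
example = {"Hs": 60, "D": 60, "Hy": 60, "Pd": 85, "Mf": 60,
           "Pa": 60, "Pt": 60, "Sc": 78, "Ma": 80}
for name, met in pp_criteria(example).items():
    print(name, met)
```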


DISCUSSION


The results indicated that maladjusted Ss dif- 
fered little from well-adjusted Ss in their ability 
to simulate VGA on the MMPI. However, well- 
adjusted Ss were considerably more successful 
in simulating PP.

One possible reason to question the validity of 
the results is that the MMPI was used as the 
source of both the independent and the depend-
ent variables. The criterion groups (W and M)
were initially chosen on the basis of their re-
sponses to the MMPI under regular conditions;
differences were then sought between these
groups when answering the MMPI under re-
sponse-bias conditions. It might be considered
that, under such circumstances, the likelihood of
finding the anticipated differences would be
spuriously heightened. However, it is unlikely
that the results of the present study were seri-
ously affected by this aspect of the experimental
design, since the groups differed only minimally
in their simulation of VGA. Additionally, it
might be noted that the kind of pathology which
Ss had to simulate was not the kind manifested
by Group M. Examination of Table 1 shows that
the mean MMPI profile under normal conditions


of Group M Ss tended to reflect anxious and ob- 
sessive characteristics, and was somewhat differ- 
ent from the expected mean profile for psycho- 
pathic personality. Nevertheless, it is acknowl- 
edged that the possibility of contamination does 
exist when the present method of criterion se- 
lection is used and that this potential difficulty 
could be avoided by selecting criterion groups 
on a completely independent basis.2 

It is surprising, in the light of Canter’s (1963) 
findings, that the maladjusted Ss simulated VGA 
as successfully as they did. One explanation might 
be that since they were college students and 
presumably functioning satisfactorily, their de- 
gree of maladjustment was mild. If this was the 
case, then the second finding, that the two groups 
were clearly disparate in ability to simulate 
psychopathic personality, is particularly note-
worthy. An even greater discrepancy might be
expected with a maladjusted group consisting of
patients rather than college students. The fact
that some Ss simulated a particularly severe
brand of PP also deserves comment. The in-
structions apparently had the effect of inviting
these Ss to answer in as bizarre a manner as
possible. It might have been preferable to use
a behavioral description which placed more
emphasis on the ability of a psychopath to func-
tion adequately in society for considerable pe-
riods of time, in spite of his pathology.


2 It is also possible that there was some difference
between Groups W and M in ability for sustained
concentration. Such a difference may have con-
tributed to the results for the simulated PP task,
even though the tasks were separated by a rest
period.

In the present study, simulation of desirable 
and undesirable personality characteristics was 
investigated. The results are seen as contributing 
to the clarification of findings from previous 
studies of the simulation of specific personality 
patterns. Such studies are usefully interpreted in 
terms of differences between well-adjusted and 
maladjusted Ss in role-taking ability. The well- 
adjusted S is seen to be more flexible than the 
maladjusted S in ability to empathize with, and 
to predict the behavior of, people at both ex- 
tremes of social adequacy. The findings of the 
present study also indirectly suggest the possi- 
bility of interpreting social-desirability scores 
(under regular test-taking conditions) as indexes 
not of adjustment level, but of role-taking abil- 
ity. 

It would seem fruitful to extend the present 
avenue of inquiry to the simulation of other 
kinds of pathology and to other subject groups. 
In particular, it would seem desirable to com-
pare psychiatric patient populations with well-
functioning normals.


A NOTE ON “PROGNOSTIC SCALES IN SCHIZOPHRENIA”


An article on prognostic scales in schizophrenia by Garfield and Sundland is
criticized and several ways in which it is misleading are pointed out.


In an article appearing in the Journal of 
Consulting Psychology (Garfield & Sundland, 
1966), a report by Farina, Garmezy, Zalusky, 
and Becker (1962) was criticized for arriving 
at a faulty conclusion. The conclusion is quoted 
from the Farina et al. study as follows, “that 
adequate interpersonal adjustment was prognos- 
tically more significant than marital status for 
this group of female patients [Garfield & Sund- 
land, 1966, p. 23].” They state that the con- 
clusion to which they object is based on a 
regression analysis done by Farina et al. In that 
analysis, the relative association of 13 variables 
to a measure indicative of recovery from psycho- 
sis was determined. The results showed that ade- 
quacy of adjustment during the premorbid period 
and recovery were more closely associated than 
marital status and recovery. 

The Garfield and Sundland article is mislead- 
ing in a number of ways. They point out, quite 
correctly, that the premorbid scale and marital 
status are correlated, and once the variance ac- 
counted for by the premorbid scale was removed, 
marital status could account for little of the 
remaining variance. But, in so doing, they rather 
clearly imply that Farina et al. did not recog- 
nize this fact, or else they would not have 
arrived at the conclusion they reached. In fact, 
the marital-status-premorbid-adjustment associa- 
tion is explicitly recognized several times in the 
Farina et al. study and a correlation of .74 
between these two variables is reported. More-
over, only a segment of the conclusion is re-
ported by Garfield and Sundland. The entire
sentence reads as follows: “Tentative evidence 
was obtained indicating that adequate inter- 
personal adjustment was prognostically more sig- 
nificant than marital status for this group of 
female patients [Farina et al., 1962, p. 60].” 
The results of the regression analysis in question 
appear sufficient to justify this conclusion. How- 
ever, the conclusion, which appears in the Sum- 
mary section of the article, is based on more 
than this. We showed in the report that even 
when married patients only are considered, the 
more adequate and extensive the premorbid in- 
terpersonal relationships, the more likely was the 
patient to recover (χ² = 5.63, df = 1, p < .05).
Also misleading, in view of this demonstration, 
is a question with which Garfield and Sundland 
conclude their article. They ask, “Do they (the
prognostic scales) merely reflect whatever is re-
flected by marital status, or do they contribute
something unique [Garfield & Sundland, 1966, p.
23]?” The Farina et al. study clearly shows that
prognostic scales do reflect something unique.


A FURTHER NOTE ON “PROGNOSTIC SCALES IN SCHIZOPHRENIA”


SOL L. GARFIELD 
Teachers College, Columbia University 


Marital status and premorbid personality are discussed as prognostic indicators 


of improvement in schizophrenia. 


The point at issue between Dr. Farina and
Dr. Sundland and myself basically con-
cerns the matter of interpreting data. While
such a matter, perhaps, cannot ever be resolved
fully, I will try to explain the reasons for our
interpretation.

The issue is whether “premorbid adjustment” 
predicts therapeutic or hospital outcome better 
than marital status. We took exception to the 
conclusion reached by Farina, Garmezy, Zalusky 
and Becker (1962) “that adequate interpersonal 
adjustment was prognostically more significant 
than marital status,” and to the regression analy- 
sis upon which it appeared to be based. In their 
analysis, the variable which clearly accounts for 
most of the variance was subscale C of the 
Phillips Scale. In this scale, marital status is 
clearly emphasized, particularly for subjects 30 
years of age and above. This finding appears to 
support, even in a contaminated way, the impor- 
tance of marital status as a predictive variable. 
Having extracted this variable, which includes 
marital status as an important component, the 
authors then found that marital status alone 
ranked only eleventh in their analysis of 13 
variables and contributed only a small percentage 
of the additional variance. 

It is here that our interpretations diverge. 
After extracting subscale C, marital status alone 
could not be meaningfully appraised in terms 
of this analysis. The significance of marital status 
as a predictor variable is grossly underrated by 
this particular analysis, yet its obvious impor- 
tance is clearly attested to by Farina et al. who 
report it as being significantly related to re-
covery. In addition, a correlation of .74 was
found in their study between marital status and 


total premorbid personality. The effect of the 
method of analysis used on the possible conclu- 
sions to be drawn is further illustrated by com- 
paring the variables, education and marital status. 
Education is ranked second in order of impor- 
tance in terms of the regression analysis, but it 
was not found to be significantly related to re- 
covery in a direct analysis of these two variables. 
Marital status, however, which is ranked eleventh 
by means of the regression analysis, is clearly 
related to recovery, and the data from both of 
our studies agree in this regard. 

I would also like to state that we did not 
intend to mislead the reader or in any way to 
imply that Farina et al. were not aware of the 
close correlation between marital status and 
premorbid adjustment. In fact, we specifically 
stated that “they also found that married pa- 
tients recovered at a significantly higher rate 
than” unmarried patients. One cannot quote at
length in a brief article, and our partial quote 
was solely for purposes of economy. For the 
same reason, I will respond only briefly to 
Farina’s comment that the question with which 
Sundland and I concluded our article was mis-
leading. The question to which reference is made
did not conclude the article. In the same para- 
graph from which Farina quotes will be found 
some answer to our own question. One should 
conclude the paragraph in order to get the 
complete point-of-view. 

Finally, it is worth emphasizing again that 
both Farina et al. and Garfield and Sundland 
pointed to the need for further research to 
clarify these matters. 
MMPI PROFILES OF MONOZYGOTIC AND DIZYGOTIC
TWIN PAIRS


MARVIN REZNIKOFF 


Institute of Living 


In an effort to obtain more quantifiable and 
objective data on the potential contribution of 
genetic factors to personality characteristics, 
Gottesman (1963) employed psychological tests 
in assessing pairs of fraternal and identical high 
school age twins. The present study represented 
a further attempt at more precise evaluation 
of personality parameters in twins, this time 
examining an adult population. 

The procedure entailed administration of an 
IBM card form of the MMPI to 18 mono- 
zygotic (MZ) and 16 like-sexed dizygotic (DZ) 
twin pairs. The mean ages of the groups were
MZ twins 40.7 years and DZ twins 39.0 years, 
with ranges of 18-66 years and 19-61 years, 
respectively. These subjects were a random 
sample drawn from 1,100 pairs of twins living 
in Connecticut. The zygosity of the twins was
established through extensive blood typing. 

Each subject’s raw scores on 3 validity and 
11 clinical MMPI scales were converted to T 
scores. After it had been ascertained that sex 
differences in scores within the two groups were 
negligible, male and female subjects were com- 
bined, and intraclass correlations were computed 
for the total MZ and DZ groups on each of 
the test scales. To facilitate comparisons with 
Gottesman’s adolescent sample, heritability indexes
(H) were also calculated. Gottesman defines
H as “the proportion of the total trait
variance associated with genetic factors.” 
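
The two statistics named above can be illustrated with a short sketch. The report does not state its computing formulas, so two assumptions are made here: the intraclass correlation is taken in its one-way analysis-of-variance form for classes of size 2, and H is taken in the variance-ratio form associated with Holzinger, H = (V_DZ - V_MZ) / V_DZ, where V is the within-pair variance. The function names and the T scores are hypothetical and are not data from the study.

```python
from typing import Sequence, Tuple

Pair = Tuple[float, float]  # the two co-twins' T scores on one scale


def within_pair_variance(pairs: Sequence[Pair]) -> float:
    """Within-pair variance: sum of squared co-twin differences over 2n."""
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


def intraclass_r(pairs: Sequence[Pair]) -> float:
    """One-way ANOVA intraclass correlation for classes of size 2."""
    n = len(pairs)
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    # Between-pairs mean square, n - 1 degrees of freedom.
    ms_between = 2 * sum(((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-pairs mean square, n degrees of freedom.
    ms_within = within_pair_variance(pairs)
    return (ms_between - ms_within) / (ms_between + ms_within)


def holzinger_h(mz_pairs: Sequence[Pair], dz_pairs: Sequence[Pair]) -> float:
    """Assumed heritability index: (V_DZ - V_MZ) / V_DZ."""
    v_mz = within_pair_variance(mz_pairs)
    v_dz = within_pair_variance(dz_pairs)
    return (v_dz - v_mz) / v_dz


# Hypothetical T scores, invented for illustration only.
mz = [(50, 52), (61, 59), (45, 44), (70, 68)]
dz = [(50, 58), (61, 55), (45, 52), (70, 62)]
print(round(intraclass_r(mz), 3), round(intraclass_r(dz), 3), round(holzinger_h(mz, dz), 3))
```

Under this assumed form, H approaches 1 when MZ co-twins differ far less than DZ co-twins and falls to 0 when the two within-pair variances are equal.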

The intraclass correlations for 9 of the 14 
MMPI scales were significantly different from 
zero for the MZ twins, while only 2 reached 
significance in the DZ group. The MZ intraclass 
correlations were larger than the corresponding 
correlations for the DZ twins on all but three 
scales—K, Pd, and Es. While Gottesman found 
significant H’s on the D, Pd, and Si scales, the 
H’s that attained significance in the current study 
were on the Hs and Mf scales. 

The relatively small numbers in each twin 
group would certainly make any interpretation 
of the data that is offered very tentative. How- 
ever, the overall results confirm Gottesman’s im- 
pression that genetic influences appear to be 
operant in some aspects of personality, at least 
as measured by a self-descriptive personality in- 
ventory. They also substantiate his conjecture 
that rate of development would probably make 
the extrapolation of genetic findings from an 
adolescent to an adult population quite spurious. 

Gottesman interpreted his data on adolescent 
twins as supporting the presence of genetic fac- 
tors in psychotic disorders, and indicated that 
hysterical and hypochondriacal neurotic states
have, at best, a minimal genetic basis. Directly 
at variance with this, the current adult twin 
data revealed significant genetic components on 
the MMPI neurotic scales and no manifest 
genetic aspects on the scales most closely associ- 
ated with psychosis. The sources of this disparity 
are presently unclear. A limitation of the re- 
search may have been the sole use of a self- 
descriptive inventory. It is suggested that pro- 
jective instruments could elicit perhaps more 
basic genetic variables. 
PERCEIVED EMPATHY, INTERVIEWER BEHAVIOR,
AND INTERVIEWEE ANXIETY


WILLIAM D. PIERCE AND DONALD L. MOSHER
The Ohio State University 


This study is a psychotherapy analogue which 
studied perceived empathy as a function of the 
interviewees’ anxiety about personal interviews 
and the interviewer’s timing of his remarks. 

A median split of the scores of 60 male sub- 
jects (Ss) on a General Anxiety Questionnaire 
was used to compose four groups by assigning 
Ss to the appropriate and inappropriate interview 
conditions. The appropriateness of the timing of
the experimenter’s (E’s) remarks was varied by
introducing interruptions and silences into the
inappropriate condition, whereas E’s remarks
were appropriately timed in the appropriate con- 
dition. In both interview conditions, the inter- 
viewer’s verbal responses were limited to a reper- 
toire of 14 nondirective remarks which were 
selected to fit the context of the interview and 
which were spoken with a warm and empathic 
vocal quality. During the 15-minute appropriate
interview condition, E never permitted 5 seconds
of silence to elapse without making a comment 
nor did he interrupt S while he was speaking. 
The first 5 minutes of the inappropriate interview
was conducted identically to the appropriate
interview, but during the second 5 minutes E
interrupted S with a remark each time S spoke 
for 3 seconds, and during the final 5 minutes 
E did not respond until 15 seconds of silence 
had elapsed. The inappropriate interview condi- 
tion was adapted from the standard interview 
procedure of Matarazzo (1962). Following the 
interview, Ss completed the Post-Interview Anxiety
Questionnaire and the Barrett-Lennard Perceived
Empathy Questionnaire, and a postexperimental
inquiry was conducted.

Interviews were randomly tape-recorded. 
Seventy excerpted statements were submitted to 
two groups of four judges to check to see if the 
vocal quality was constant in both interview 
conditions; it was. 

The postexperimental inquiry revealed that Ss 
in the inappropriate interviews noticed the silences
and interruptions (χ² = 3.94; df = 1; p < .05).
They also felt more uneasy during the last
two sections of the interview, while Ss in the
appropriate interview felt most uneasy initially
(χ² = 16.98; df = 2; p < .001).

A treatment by levels analysis of variance
computed with the perceived-empathy scores as
the dependent variable confirmed the hypothesis
(F = 9.68; df = 1, 56; p < .01) that Ss in the
appropriate interview condition perceived E as
more empathic than did Ss who were interrupted
and left in silence. This analysis did not confirm
the hypothesis (F = .89) that Ss who are more
anxious about interviews in general would perceive
E as less empathic. The perceived-empathy scores
were inversely correlated with the
post-interview-anxiety scores in both the
appropriate (r = -.51; df = 28; p < .01) and
inappropriate (r = -.62; df = 28; p < .01) conditions.
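
The significance levels attached to the two correlations can be checked with the usual conversion of r to t, t = r * sqrt(df / (1 - r^2)) with df = N - 2. The sketch below is only an illustration of that arithmetic: the function name is hypothetical, and the criterion of 2.763 is the two-tailed .01 value of t for 28 df as given in standard tables, not a figure from the report.

```python
import math


def t_from_r(r: float, df: int) -> float:
    """t statistic for testing a Pearson r against zero, with df = N - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


# Two-tailed .01 critical value of t for 28 df, taken from a standard t table.
T_CRIT_01_DF28 = 2.763

for condition, r in (("appropriate", -0.51), ("inappropriate", -0.62)):
    t = t_from_r(r, 28)
    print(f"{condition}: t(28) = {t:.2f}, exceeds .01 criterion: {abs(t) > T_CRIT_01_DF28}")
```

Both values of t exceed the tabled criterion in absolute size, which agrees with the reported p < .01 for each condition.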

Matarazzo (1962, p. 476) reports that experi- 
enced psychiatrists and psychologists are unable 
to distinguish their standardized interviews from 
typical initial interviews. The present study sug- 
gests that the interviewees may perceive the 
interviewer as less empathic and feel more un- 
easy during the silence and interruption phases 
of such interviews. 
A TEST OF A BODY-IMAGE HYPOTHESIS OF HAND EXTINCTION
IN DOUBLE SIMULTANEOUS STIMULATION


FREDA MORRIS 
Kankakee State Hospital, Kankakee, Illinois 


When normal subjects (Ss) with closed eyes 
are touched simultaneously on the face and 
hand, about 50% report both stimuli. The others
do not report the touch on the hand, from
1 to 20 or more trials. This test is called 
double simultaneous stimulation (DSS) and 
hand extinction is the failure to report the touch 
on the hand.

Bender (Bender, Green, & Fink, 1954) stated 
in his last publication on the subject that no neu- 
rological explanation was feasible and presented 
a psychological explanation. Linn (1955) criti- 
cized Bender’s proposal and suggested that the 
phenomenon might be related to the perceptual 
deficit believed to occur in very young infants in 
which they are unable to differentiate between 
their face and the mother’s breast, and that the 
hand enters into this percept as a substitute for 
the mother’s breast. Thus, hand-face fusion oc- 
curs. From this theorizing, the authors deduced 
that groups assumed to be oral in character 
structure should make more errors than control 
groups. 

Two oral and two control groups were used to 
test this hypothesis. The oral groups were 20
alcoholics (AA members) and 20 hospitalized 
ulcer patients. The control groups were 30 nor- 
mals and 15 hospitalized rheumatoid arthritis 
and neurodermatitis patients. The latter group
was chosen as a psychosomatic control whose 
symptoms are not believed to have arisen as a 
result of oral conflict. The groups were compared
by use of chi-square technique with the
frequency distributions dichotomized according 
to the number of Ss who made errors versus 
those who made none. The combined experi- 
mental groups had significantly more Ss making 
errors than did the combined control groups. The 
ulcer group differed significantly from the normal 
group in the predicted direction. None of the 
other comparisons were significant. The results 
support the hypothesis, though not strongly, since
the patient control group alone is not signifi- 
cantly different from the experimental groups. 
The dependency feelings which may have been 
aroused by hospitalization in the ulcer and pa- 
tient control groups may have encouraged re- 
gression and thus augmented a tendency toward 
DSS errors. The security of AA membership
may have decreased regressive tendencies in the
alcoholic group and thus decreased DSS errors.
Thus, a lack of control over environmental con- 
ditions which encourage or discourage regression 
may have obscured the relationship between 
orality and a tendency to make DSS errors. The 
idea of hand-face fusion has received support but 
just what personality variables, aside from patho- 
logical orality, may be associated with it are un- 
known. An empirical study relating DSS errors 
and a number of personality factors could be 
profitable. Anxiety and oral regressive tendencies
should be checked. Regression in the service of 
the ego should be differentiated from maladaptive
regression.


REFERENCES 


Bender, M., Green, M., & Fink, M. Patterns of perceptual
organization with simultaneous stimuli.
Archives of Neurology and Psychiatry, 1954, 72,
233-244.

Linn, L. Some developmental aspects of body image.
International Journal of Psycho-Analysis, 1955,
35, 35-42.


(Received August 24, 1965) 


Journal of Consulting Psychology 
1967, Vol. 31, No. 1, 103 


VALIDITIES OF SHO 
Several short forms of the Wechsler Adult
Intelligence Scale based upon several different 
rationales have been developed in recent years 
(Doppelt, 1956; Jones, 1962; Maxwell, 1957). 
Typically, the short forms were developed using 
the standardization data (Wechsler, 1955), but 
only recently have they been widely cross-vali- 
dated—almost always on psychiatric populations. 
The Doppelt forms have been the subject of 
five analyses (Clayton & Payne, 1959; Himel- 
stein, 1957; Jones, 1962; Sterne, 1957), while the 
validity of some of the Maxwell batteries has been 
the subject of one investigation (Howard, 1959). 
The present study presents results of a cross- 
validation of 49 different short forms developed 
by the three researchers cited above. Unlike 
previous studies, the short forms are cross-validated
on both a psychiatric and a nonpsychiatric
population.

Several kinds of analyses were done on the 
estimated Full Scale scores: (a) they were compared
with actual Full Scale scores, using t tests;
(b) they were correlated with actual Full Scale
scores; and (c) they were studied for their
accuracy of classification (i.e., the percentage of 
time that they erroneously placed an individual 
in a higher or in a lower IQ classification, e.g., 
bright, normal, average, borderline, etc., than
was revealed by his scores on the full battery). 
The results of these analyses revealed that for 
both the psychiatric and the nonpsychiatric 
populations (a) the mean estimated and the mean 
Full Scale scores were significantly different in 
only 3 of 98 comparisons; (b) correlations be- 
tween estimated and obtained Full Scale scores 
were in the high .80’s or .90’s; (c) in the great 
majority of instances subjects were placed within 
one classification of their intelligence classifica- 
tion as revealed by scores on the total test. 
PRODUCTION OF ASSOCIATIVE SEQUENCES IN PROCESS-REACTIVE
SCHIZOPHRENIC AND NONSCHIZOPHRENIC GROUPS

A previous study (Judson & Katahn, 1960)
disclosed significant differences between process- 
reactive schizophrenics in the recall of friends’ 
names over a 10-minute interval. The differences 
were greater than would have been expected 
from their recall of animal names and IQ scores. 
This was interpreted as reflecting a special re- 
striction in interpersonal relationships in a 
generally impoverished relationship with the 
environment. No differences in associative inter- 
ference or efficiency were found using Bousfield’s 
method of analysis, with which Lester (1960) 
had previously detected differences amongst 
various diagnostic groups. 

The present study sought to extend the find- 
ings and employed both schizophrenic and non- 
schizophrenic patients. A total of 52 male sub- 
jects were rated on the Phillips scale for good 
and poor premorbid adjustment. In addition to 
the free recall of animal and friends’ names, 
subjects also attempted to learn a new list of 
60 words which contained animal names, per- 
sons’ first names, vegetables, and professions. 
According to Lester, a learning-recall task is 
more sensitive to the presence of associative 
interference than the free-recall tasks. 

Both the process-reactive dimension and diag- 
nostic category made significant independent and 
interacting contributions to the recall of friends’ 
names, that is, the material with social connota- 
tions, but not to the recall of animal names. By 
subgroups, the rank order of recall from least 
to greatest, was process schizophrenics, process 
nonschizophrenics, reactive schizophrenics, reactive
nonschizophrenics. The process-reactive distinction
thus proved meaningful for nonschizophrenic
as well as schizophrenic patients on this
material. While schizophrenics were predictably 
inferior to nonschizophrenics in new learning, 
process patients unexpectedly tended to be super- 
ior to reactive patients within each diagnostic 
category. The greatest deficit in both recall and 
new learning of material of an interpersonal 
nature was noted in the process-schizophrenic 
group. The Bousfield method, specifically de- 
signed to measure associative interference, failed 
to detect differences, while standard analyses of 
variance of the raw data yielded significant inter- 
actions indicative of more efficient production in 
the nonschizophrenic group following new learn- 
ing. Since two experimenters were controlled as 
variables in the study, with no interactions in- 
volving any of the above findings, the results 
are considered to have been essentially replicated 
within the same experiment. 

It is concluded that the use of free recall and 
new learning of socially relevant material may 
have some prognostic utility when combined 
with the Phillips scale. With respect to new
learning, a deficit appears likely in schizophrenic 
patients regardless of the nature of the material.
On the other hand, except for material with 
social connotations, process patients appear to be 
as capable and as motivated to learn as reactive 
patients. Failure to detect associative interfer- 
ence with the Bousfield technique may be related 
to flaws in the curve fitting procedures. 
INTROVERSION, NEUROTICISM, RIGIDITY, AND DOGMATISM

There are two problems concerning the often
hypothesized relationship between neuroticism 
and behavioral rigidity. First, are there person- 
ality dimensions other than neuroticism which 
affect the relationship? Eysenck (1947) hypothe- 
sized that introverted neurotics are rigid, but 
that extroverted neurotics are not. Second, Rok- 
each (1960) offered a distinction between rigid- 
ity and dogmatism, in which rigidity can be 
defined as the inability to produce novel or 
changed responses while dogmatism can be de- 
fined as an inability to utilize novel responses 
which have been produced. Which of these two 
types of inflexibility characterizes neurotics? 

To explore answers to these two questions,
a 2 × 2 experimental design was set up.
The Maudsley Personality Inventory (Eysenck, 
1959) was administered to 194 students. The 
scale gives two scores, one for neuroticism and 
one for introversion-extroversion. Four ex- 
treme groups were isolated: High Neuroticism- 
Introvert (HN-I), High Neuroticism-Extrovert 
(HN-E), Low Neuroticism-Introvert (LN-I), 
and Low Neuroticism-Extrovert (LN-E). These 
four groups (N = 10 for each) constituted the 
groups in the study. The subjects were run one 
at a time, their task being the Doodlebug 
problem invented by Rokeach (1960). In order 
to solve the Doodlebug problem, the subject 
must overcome several sets of beliefs and utilize
the insights so gained to solve the problem. The 
task generates two separate scores, one for the
number of beliefs spontaneously overcome—the 
number of novel or changed responses—and one 
for the time to solution after they are overcome 
—a measure of the utilization of novel responses. 

Analyses of the data were performed by parti- 
tion of chi-square. The groups did not differ in 
their ability to produce novel or changed re- 
sponses, Rokeach’s rigidity concept, nor did they 
differ in their overall inflexibility. When analysis 
was made of the subjects’ ability to utilize the 
novel or changed responses which they had pro- 
duced, two separate chi-square analyses yielded 
values of 10.00 (df = 1, p < .01) and 8.00 (df =
1, p < .01), indicating a significant interaction of
neuroticism and introversion. Introverted neu- 
rotics were highly inflexible on this measure— 
Rokeach’s dogmatism concept—while extroverted 
neurotics did not differ from non-neurotics. There 
were no significant main effects. We may con- 
clude that the groups were not different in their 
ability to produce novel responses, that is, in 
their rigidity, but that the HN-I group was 
relatively inferior in its ability to utilize such 
responses, that is, in its dogmatism. 
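
The probability levels quoted for the two chi-square values can be verified directly. With 1 degree of freedom a chi-square variate is a squared standard normal deviate, so its upper-tail probability is erfc(sqrt(x / 2)). The sketch below applies that identity to the reported values of 10.00 and 8.00; the function name is hypothetical, and only the Python standard library is used.

```python
import math


def chi2_p_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    # With 1 df, chi-square is a squared standard normal, so
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


for value in (10.00, 8.00):
    p = chi2_p_df1(value)
    print(f"chi-square = {value:.2f}, df = 1, p = {p:.4f}, below .01: {p < 0.01}")
```

Both probabilities fall below .01, consistent with the levels reported for the interaction of neuroticism and introversion.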

Other analyses failed to offer support for Rok- 
each’s hypothesis that subjects who are dogmatic 
tend to be hostile or rejecting toward the task. 

Two conclusions follow from this study. The 
first is that it is inappropriate to speak loosely 
of the “rigidity” or “inflexibility” of neurotic 
behavior, as it was shown that neurotics are not 
different from non-neurotics in their ability to 
produce novel responses, but only in their ability 
to utilize such responses. Secondly, not all neu- 
rotics manifest this inability; it is shown by 
introverted neurotics only. 
DEPRESSIVE AFFECT, SPEED OF RESPONSE, AND AGE

Two areas of investigation which have received
much attention in the study of adult aging
are affective depression and speed of response. 
“Depression is the most frequent psychological 
difficulty encountered in the aged . . . [Zinberg 
& Kaufman, 1963, p. 66],” and speed of response 
has been considered so important that a confer- 
ence was held on just this problem alone 
(Welford & Birren, 1965). 

Since depression and speed of response are 
both related to age, they may be related to each 
other. In fact, depression is often defined in
terms of a “psychomotor retardation.” There is 
also some empirical evidence which bears on 
the possible relationship. D-scale scores of the 
Minnesota Multiphasic Personality Inventory 
(MMPI) and reaction times (RT) were cor- 
related 0.52 with men aged 65 and over, who 
were highly selected for health and who were 
medically diagnosed as physically normal. How- 
ever, there was no correlation between these 
variables with men of the same age who were 
selected in the same way, but who were found 
without detectable medical problems of any kind 
—a most unusual group (Butler & Perlin, 1963, 
p. 299).

It is possible that with the increased variance
which may be associated with an extended age 
range of subjects, and with no medical screen- 
ing, the correlation between depression and speed 
of response would be of greater magnitude. This 
study was designed to test this possibility. 
Depression was measured by a true-false
mimeographed form of the MMPI D scale, and 
speed of response was measured as simple audi- 
tory RT. The specific details of the RT measure- 
ments may be seen in a previous report (Bot- 
winick & Thompson, 1965), but of special rele- 
vance here is that RTs were measured in the 
context of four foreperiods (0.5, 3.0, 6.0, and 
15.0 seconds) within both regular and irregular 
series. The subjects were community-residing 
volunteer men and women of two age groups: 
67 to 87 and 18 to 35 (elderly men, N = 23; 
elderly women, N= 28; young men, N =37; 
and young women, N = 20). 

Pearson product-moment correlations were 
computed between the measures of depression 
and speed of response. There were 99 coefficients
in all, keeping separate, and combining in vari- 
ous combinations, the categories of age, sex, 
foreperiod, and total regular and irregular RT 
series. Not one coefficient of correlation was
statistically significant at the .05 level. Within 
the limits of the present data, therefore, no evi- 
dence of a relationship between depression and 
speed of response was seen. The older subjects 
had higher D-scale scores and slower RTs than 
did the younger subjects (p<.01), as has 
been reported in a variety of studies. 


REFERENCES 


Botwinick, J., & Thompson, L. W. Premotor and
motor components of reaction time. Journal of
Experimental Psychology, 1966, 71, 9-15.

Butler, R. N., & Perlin, S. Physiological-psychological-psychiatric
interrelationship. In J. E. Birren
et al. (Eds.), Human aging: A biological and behavioral
study. United States Government Printing
Office, 1963. Pp. 293-300.

Welford, A. T., & Birren, J. E. (Eds.) Behavior,
aging and the nervous system. Springfield, Ill.:
Charles C Thomas, 1965.

Zinberg, N. E., & Kaufman, I. (Eds.) Normal
psychology of the aging process. International
Universities Press, New York, 1963.


(Received January 13, 1966) 


Journal of Consulting Psychology
1967, Vol. 31, No. 1, 107


JEROME M. SATTLER 
San Diego State College 


The relationship between personality characteristics
and early recollections (ERs) was investigated.
The hypotheses were that a consistent
relationship would be established (a) between
Taylor Manifest Anxiety (MA) Scale scores and
the anxiety content of ERs, and (b) between R
scale scores of the Inventory of Factors STDCR
and the introversion-extroversion (I-E) content
of ERs.

[...]y-four college subjects (Ss) of both sexes
participated. The two personality inventories
were routinely administered to all Ss during a
regularly scheduled class period. ERs were collected
2 weeks after the inventories were administered.
The MA scale distribution was divided
into thirds, while the R scale distribution was
divided at the median; thus, six subgroups were
formed. The Ss rated their first ERs for both anxiety
and I-E by using a different 6-point scale for
each rating. Two judges, who had no knowledge
of Ss’ personality classifications, also rated the
ERs using the same 6-point scales. Instructions
were provided to both Ss and judges concerning
the rating scales. The Ss and judges also
received detailed information about the personality
characteristics which they were rating.
Product-moment correlations were utilized for
determining the reliability of the ratings. Results
were as follows: interjudge anxiety ratings
[...] (p < .001); interjudge I-E ratings .42
(p < .001); mean judges’ and Ss’ anxiety ratings
.64 (p < .001); mean judges’ and Ss’ I-E ratings
.48 (p < .001). Interjudge reliability was significantly
higher for the anxiety ratings than for
the I-E ratings (z = 4.07, p < .001). Four analyses
of variance were performed using Ss’ and
mean judges’ ER ratings. Significant main effects
supporting the hypotheses appeared in three of
the four analyses. Judges’ mean ER ratings, however,
failed to discriminate between the I and E
groups. Interactions were not significant for any
of the analyses. Product-moment correlations between
ER ratings and personality inventory
scores supported the analyses-of-variance findings.
Coefficients were as follows: MA scale
and Ss’ anxiety ratings .45 (p < .001); R and
Ss’ I-E ratings .50 (p < .001); MA scale and
mean judges’ anxiety ratings .33 (p < .005); R
and mean judges’ I-E ratings .17 (p > .05). Sex
and personality groups were not significantly different
with respect to age of recall. The latter
contrasts with Winthrop’s (1958) findings which
showed that Ss describing themselves as introverts
were more likely to report memories at
an earlier age. Differences in S population and
in personality measures may account in part for
the divergent results.

The significant findings indicate that ERs have
some utility in evaluating personality characteristics.
The nonsignificant judges’ I-E ratings and
the less reliable judges’ I-E ratings indicate that
I-E features of ERs are more difficult to evaluate
than anxiety features. The agreement between
Ss’ I-E ratings and their R scale scores suggests
that they may have approached both tasks in
a similar manner. Adler’s position, that ERs reflect
Ss’ style of life, receives some support from
the study.
AN ABBREVIATION OF THE WISC FOR CLINICAL USE

Satz and Mogel (1962) and Mogel and Satz
(1963) devised an abbreviated form of the 
Wechsler Adult Intelligence Scale (WAIS) which 
satisfied both clinical usefulness and validity. 
The procedure involved the selection of items 
(46%) from 9 of the 11 WAIS subtests, leaving 
Digit Span and Digit Symbol unchanged. The 
advantages of this type of short form were two- 
fold: (a) all subtests were represented and (b) 
considerable time in administration was saved. 
High correlations between the abbreviated and 
standard forms were found regardless of intel- 
lectual level or diagnostic classification. In a 
recent study by Yudin (1966), it was sug- 
gested that the Satz-Mogel abbreviation was ap- 
plicable to the Wechsler Intelligence Scale for 
Children (WISC). Using an emotionally dis- 
turbed sample of children, Yudin found ex- 
tremely high correlations between the standard 
and abbreviated WISC subtests and scales for 
different age groups. Intellectual level, however, 
was shown to lower the magnitude of these correlations,
particularly in the upper IQ ranges.
If this finding were true, it would limit the use- 
fulness of the abbreviated procedure. The pres- 
ent study was therefore designed to test the 
replicability of Yudin’s findings, with this short 
form, on a new sample of emotionally disturbed 
children.

The WISC records of 150 emotionally dis- 
turbed children were rescored according to the 
procedure outlined by Yudin. The correction 
factor, however, was not employed. The total 
group ranged in age from 5 years, 11 months, 
to 15 years, 11 months, with a mean of 12 years, 
3 months. The Full Scale IQ of the group ranged 
from 46 to 137, with a mean of 96.11 and a 
standard deviation of 17.60. These subject characteristics
closely approximated Yudin’s sample
in terms of age, Full Scale IQ, N, and diagnosis.
The correlations between the two forms for
Verbal IQ (r = .96), Performance IQ (r = .95),
and Full Scale IQ (r = .97) were all high and
almost identical to those reported by Yudin.
Similar findings were obtained among the verbal
subtests (which ranged from .84 for Information
to .90 for Arithmetic) and performance subtests
(which ranged from .75 for Picture Completion
to .91 for Block Design). Age was shown to
have no appreciable effect on the magnitude of
these correlations for either subtests or scales.
Intellectual level, however, was shown to lower
the correlation coefficients, particularly in the
upper ranges (Full Scale IQ > 109): Verbal IQ
= .72, Performance IQ = .83, Full Scale IQ =
.77. There was also considerable variation within
the verbal and performance subtests for this
group, ranging from .46 for Object Assembly to
.92 for Picture Arrangement.

The present findings support the usefulness of 
the abbreviated WISC form, except for subjects 
in the Bright Normal and Superior Ranges of 
intelligence. These ranges comprise roughly 25% 
of the children in the normal population. Al- 
though the correlations between the two forms 
were impressive when analyzed on the combined 
sample and by age levels, the correlations (scales 
and subtests) did drop appreciably in the upper 
IQ ranges when analyzed by level of Full Scale 
IQ. Yudin’s study reported only correlations on 
Full Scale IQ in the different IQ ranges, which 
might have masked much of the subtest vari- 
ability. The abbreviated WISC, however, did 
predict successfully at the remaining IQ ranges, 
and at each of the three age levels. 
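
The figure of roughly 25% for the upper ranges can be reproduced from the normal curve. The sketch below assumes that Wechsler deviation IQs are normally distributed with a mean of 100 and a standard deviation of 15, and that the Bright Normal and Superior classifications span IQs of 110 through 129; these scale conventions, like the function name, are assumptions supplied for illustration and are not stated in the report.

```python
import math


def upper_tail(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Proportion of a normal population scoring at or above the given IQ."""
    z = (iq - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Everyone at or above the lower bound of Bright Normal (assumed to be 110).
at_or_above_110 = upper_tail(110.0)
# Bright Normal and Superior only (assumed range 110 up to, not including, 130).
bright_to_superior = upper_tail(110.0) - upper_tail(130.0)

print(f"IQ of 110 or above: {at_or_above_110:.3f}")
print(f"IQ of 110 to 129:   {bright_to_superior:.3f}")
```

Either reading gives a proportion near one quarter, so the limitation applies to a sizable share of the children likely to be tested.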
STRATEGY OF OUTCOME RESEARCH IN PSYCHOTHERAPY

The current status of psychotherapeutic research is reviewed, concluding that
the greatest need is for outcome studies. The major variables and domains 
involved in psychotherapy are delineated to show where errors have occurred
in past investigations, and to serve as a basis for determining the degree of 
control necessary to answer the varied questions concerning the practice of 
psychotherapy. Strategic choices for accumulating knowledge are suggested in 
terms of the selection of variables, criteria, and adequate research designs for 
a given level of empirical knowledge. Contrary to many current statements, 
the present methodology of scientific psychology does appear adequate for 
evaluating psychotherapy; however, the value of different research approaches 
from case studies to factorial designs must be recognized and used strategically. 


Shlien (1966) has summarized the overall 
impact of the past 25 years of psychothera- 
peutic research by pointing out, “Continued 
subscription [to psychotherapy] is based 
upon personal conviction, investment, and 
observation rather than upon general evi- 
dence [p. 125].” Eysenck’s (1952) first re- 
view of the outcome literature concluded, 
“The figures fail to support the hypothesis 
that psychotherapy facilitates recovery from 
neurotic disorder [p. 323].” His more recent
reviews (Eysenck, 1961, 1965) have led to
essentially the same conclusions. In the face 
of such evidence, only two alternatives pre- 
sent themselves: (a) Psychotherapy does not 
“work”; that is, it is ineffective and should 
be abandoned, or (b) past studies have been
inappropriate or inadequate evaluations of 
the efficacy of psychotherapy. The consensus 
of research workers who have considered the 
basic principles and methods for the evalua- 
tion of psychological treatment strongly 
favors the second alternative (e.g., Edwards 
& Cronbach, 1952; Hoch & Zubin, 1964; 
Rubinstein & Parloff, 1959; Strupp & Lubor- 
sky, 1962). 

Parloff and Rubinstein (1959) have sum- 
marized the sociological obstacles to progress 
in outcome research which, combined with 
methodological difficulties in criterion definition, 
resulted in what Zubin (1964) calls a 
“flight into process [p. 127].” Large scale 
researchers came to focus on the process of 
therapist-client interaction. Thus, sociological 
difficulties were circumvented by the impor- 
tance of personality theory, and criterion 
problems were eased by a focus on intra- 
therapy measures. The assumed relationship 
of such studies to treatment outcome rests 
on the common belief that more process 
studies are necessary to identify all impor- 
tant variables before evaluations can be 
meaningfully made (Hoch, 1964). Similarly, 
“The evaluation of the effects of therapy is 
not a task we can handle with existing tools 
[Hyman & Berger, 1965, p. 322].” 

A major problem with the process approach 
is that the importance of a variable or 
theory for outcome cannot be established 
without concurrent assessment of outcome 
(Greenhouse, 1964). It is precisely through 
outcome studies with concurrent measurement 
or manipulation of variables whose influence 
is unknown that important variables are 
likely to be identified. If the influence of all 
variables were known, the question of evalua- 
tion would be spurious from the beginning. 
Additionally, many statements such as that 
of Hyman and Berger appear to result from 
a confusion of what Reichenbach (1938) 
has called the context of discovery and the 
context of justification. While it may be true 
that psychological science may never be in 
a position to measure the truth of the com- 
plex experiences which take place between 
two or more persons (context of discovery), 
verifying the degree to which the goals of 
such an interaction are reached or not reached 
(context of justification) is logically no different 
for psychotherapy than for any other 
change-agency (Sanford, 1962). 

TABLE 1 

MAJOR VARIABLES (DOMAINS) INVOLVED IN 
PSYCHOTHERAPY RESEARCH 

1. Clients 
   a. Distressing behaviors (cognitive, physiological, motoric) 
   b. Relatively stable personal-social characteristics 
   c. Physical-social life environment 

2. Therapists 
   a. Therapeutic techniques 
   b. Relatively stable personal-social characteristics 
   c. Physical-social treatment environment 

3. Time 
   a. Initial contact 
   b. Pretreatment 
   c. Initial treatment stage 
   d. Main treatment stage 
   e. Termination (pretermination stage) 
   f. Posttreatment 
   g. Follow-up 

Apart from emotional and sociological 
obstacles, the principles and methods of 
outcome research are basically the same as 
any other experimental design problem, ex- 
cept for the greater number and complexity 
of variables. As with all research in psychol- 
ogy, the basic purpose is to discover phe- 
nomena—behavioral events or changes—the 
variables which affect them, and the lawfulness 
of the effects. Likewise, the greatest difficulty 
has come from research errors, that is, dis- 
crepancies between what is concluded and 
what can be concluded as a consequence of 
the experimental operations. Unfortunately, 
the majority of outcome research has suf- 
fered from what Underwood (1957) terms 
“lethal errors”—discrepancies in which there 
is no way that a scientifically meaningful 
conclusion can be reached from the pro- 
cedures used. “These cases are best exempli- 
fied by blatant confounding of stimulus vari- 
ables from different classes (environmental, 
task, subject) so that behavior changes meas- 
ured cannot be said to be the result even 
of variables within a given class [Underwood, 
1957, p. 90].” 


VARIABLES IN PSYCHOTHERAPY RESEARCH 


The major variables or domains involved in 
psychotherapy research, irrespective of theoretical 
preconceptions, are summarized in 
Table 1. Levinson (1962) and Kiesler (1966) 
have also considered these basic problems. 
Each of the variables listed in Table 1 may 
be treated as independent variables through 
selection or manipulation, exerting main ef- 
fects and interactions within domains and 
between domains. 

The essential ingredients are at least one 
client and one therapist who get together 
over some finite period of time. Clients come 
to treatment in order to obtain help in 
changing some aspect of their behavior which 
they, or someone else, find distressing. These 
distressing behaviors (1a) may vary in num- 
ber and nature and may change over time. 
Clients may also vary on relatively stable 
personal-social characteristics (1b) such as 
age, intelligence, and expectancies. 1a and 1b 
thus comprise the usual experimental class of 
subject variables. Clients work with thera- 
pists who utilize therapeutic techniques (2a) 
through which they attempt to alleviate the 
distress of the client. Like the client’s dis- 
tressing behaviors, therapeutic techniques 
may vary in number and nature and may 
change over time. Therapists, just as clients, 
may vary on relatively stable personal-social 
characteristics (2b) and, in addition, on 
characteristics related to treatment, such as 
subscription to particular “schools” of thera- 
peutic theory, experience, and type of “con- 
ditions” established. Thus, 2a and 2b com- 
prise the usual experimental class of task 
variables. Although 1c and 2c comprise 
the usual class of environmental variables, 
they are listed with clients and therapists 
due to the greater likelihood of confounding 
adjacent classes. The client’s physical-social 
life environment (1c) includes essentially all 
the intercurrent life experiences impinging 
upon him outside of the treatment situation, 
for example, family, drugs, and work situa- 
tion. The physical-social treatment environ- 
ment (2c) refers to the institutional setting 
in which treatment takes place. This domain 
may vary from private to public, fee or no 
fee; it may be a hospital, clinic, or private 
office. 

The third category of Table 1, that of 
time, is usually considered a task variable. 
Time is separated here for expository purposes 
because, in addition to task variation, 
the events in time also mark points of re- 
search focus. As a task variable, time may 
vary in terms of length of treatment contact 
and number of sessions, that is, the time 
between pretreatment (3b) and posttreatment 
(3f). Within treatment proper (3c, d, e) time 
may vary within and between different stages. 
Likewise, time may vary between initial con- 
tact (3a) and pretreatment (3b) and between 
posttreatment (3f) and follow-up (3g). The 
second aspect of the time dimension delineates 
points of research focus for study or measure- 
ment of the main effects and interactions of 
variables, either between time periods or 
within time periods. 


QUESTION PROBLEM 


From the above list of variables and the 
corresponding points in time for their occur- 
rence and measurement, it is possible to de- 
termine the necessary operations for obtain- 
ing answers to specific questions and, 
conversely, to see where research errors have 
occurred. The most obvious problem with 
past outcome research has been in the stage 
of asking questions. The initial question 
posed, “Does psychotherapy work?” is 
virtually meaningless. Psychotherapy compre- 
hends a most diversified set of procedures 
ranging from suggestion, hypnosis, reassur- 
ance, and verbal conditioning to systematic 
sets of actions and strategies based upon 
more or less tight theoretical formulations. 
Narrowing the question to specific schools of 
psychotherapy, such as “Does client-centered 
therapy work?” or “Does behavior therapy 
work?” is no more meaningful than the 
general question, since the range of pro- 
cedures remains as diversified within schools 
as within psychotherapy in general. Further- 
more, such questions fail to take into account 
the characteristics of therapists (2b) which 
may contribute to efficacy. Even if psycho- 
therapy were a homogeneous entity, these 
questions fail to specify the “what” for “does 
it work?” Here again, the wrong questions 
were asked. “Does it work for neuroses?” 
“Does it work for schizophrenics?” As with 
schools of psychotherapy, the range of individual 
differences within standard diagnostic 
categories remains so diversified as to render 
meaningless any questions or statements 
about individuals who become so labeled. 
The labeling process itself is notoriously unreliable, 
and criteria for inclusion in a particular 
class are vague and overlapping. Questions 
asked in this manner are doomed to 
committing lethal errors from their time 
of conception, allowing confounding and 
confusion of Domains 1 and 2 from the 
beginning. 

The third problem with questions that have 
been asked in outcome research is with the 
term “work” itself, that is, the criteria of 
success or improvement. While there may be 
general agreement that if psychotherapy 
works, the client will “feel better” and 
“function better” (Frank, 1959), the specific 
goals in accomplishing this end will be 
as varied as the problems which are brought 
to treatment. Without specifying the “what” 
in a question of the nature, “For what does 
it work?” the question of success remains as 
confused and heterogeneous as the domains 
of psychotherapy and clients at large. 

What is the appropriate question to be 
asked of outcome research? In all its com- 
plexity, the question towards which all 
outcome research should ultimately be directed 
is the following: What treatment, by 
whom, is most effective for this individual 
with that specific problem, and under which 
set of circumstances? Relating the basic ques- 
tion to the domains listed in Table 1, we 
find: What treatment (2a, therapeutic tech- 
niques), by whom (2b, therapists with rela- 
tively stable personal-social characteristics), 
is most effective (change in 1a, the client’s 
distressing behaviors, from 3b, pretreatment, 
to 3f and g, posttreatment and follow-up) for 
this individual (1b, clients with relatively 
stable personal-social characteristics), and 
under which set of circumstances (1c, the 
client’s physical-social life environment, and 
2c, the physical-social treatment environ- 
ment)? Posed in this manner, two points 
are obvious: (a) No single study of any 
degree of complexity will ever be capable of 
answering this question, and (b) in order for 
knowledge to meaningfully accumulate across 
separate studies and provide a solid empirical 
foundation for subsequent research, it will be 
necessary for every investigation to adequately 
describe, measure, or control each of 
the variables or domains listed in Table 1. 


CRITERION PROBLEM 


One of the most recurrent methodological 
difficulties in outcome research has been the 
criterion problem. Lack of agreement among 
criteria, not only between investigators, but 
between clients, therapists, and other sources 
in the same investigation was one factor which 
resulted in the flight into process. A major 
basis for the lack of relationship among cri- 
teria lies in the fact that different frames of 
reference may be used by persons in different 
roles for making overall judgments of success 
or improvement. Most investigators have se- 
lected criteria from some theoretical frame of 
reference, ignoring the heterogeneity of client 
populations. Criteria so selected are likely to 
be, at best, partially related to criteria selected 
from some other frame of reference. 
Parloff and Rubinstein’s (1959) statement 
that an “investigator’s selection of specific 
criteria [is] a premature and presumptuous 
value judgment [p. 278]” appears valid when 
the criteria are based upon some preconceived 
theoretical judgment which bears no demon- 
strated relationship to the client’s problems, 
or when the criteria are deemed to be the 
attainment of some “ideal” which is neces- 
sarily value laden with the mores of a par- 
ticular class, culture, or investigator. The 
normal population is heterogeneous in the 
extreme, allowing for broad ranges of varia- 
bility in ways of living. Irrespective of any 
theoretical position, the real question of out- 
come on logical and ethical grounds is 
whether or not the clients have received help 
with the distressing behaviors which brought 
them to treatment in the first place (Betz, 
1962; Hoppock, 1953; Rickard, 1965). As 
recently stated by Jerome Frank and his col- 
leagues (Battle, Imber, Hoehn-Saric, Stone, 
Nash, & Frank, 1966), 


In the absence of adequate knowledge of the causes 
of psychiatric complaints, we assume that psycho- 
therapy has removed the causes if the complaints 
are permanently relieved, and no new ones are sub- 
stituted for them [p. 185]. 


While using such tailored criteria does by- 
pass the homogeneity problem which arises 
from the use of varying frames of reference, 
it does not remove the necessity for ade- 
quately defining the dependent variables at 
some level of quantification with adequate re- 
liability and validity. Although many tech- 
niques for measuring change have been used 
in the past, few of these methods have proved 
to be acceptable (Zax & Klein, 1960). Subjective 
reports of change by clients or therapists 
are notorious for their lack of reliability 
and validity, and specific problems negate the 
use of many psychological tests (Paul, 1966). 

The most important and meaningful test of 
outcome is the change in clients’ distressing 
behaviors outside of treatment (Luborsky & 
Strupp, 1962). Some guidelines for assessment 
may be obtained by considering the process 
which results in client-therapist contact; 
that is, the client does something, under a 
set of circumstances, which disturbs someone 
sufficiently that action results—entering 
treatment (Ullmann & Krasner, in press). 
The “something” he does has been identified 
as the distressing behaviors which lead him to 
treatment. However, since this something oc- 
curs under a set of circumstances it is un- 
necessary to attempt measurement of change 
under all circumstances. Rather, assessment 
should be more or less situation specific. As 
Zax and Klein (1960) point out, the least 
used but most promising criteria are, then, 
objective behavioral criteria external to the 
treatment situation. The advantages of such 
“work sample” assessments, which may include 
self-report measures, have been presented 
elsewhere (Paul, 1966). While multiple 
measures of outcome are necessary, the 
dependent variable in any outcome evaluation 
must be, to return to Table 1, change in the 
distressing behaviors which brought the cli- 
ent to treatment (1a), from pretreatment 
(3b) to posttreatment (3f), and follow-up 
(3g), assessed external to treatment proper 
by unbiased means. 


APPROACHES TO OUTCOME RESEARCH 


Given the basic question and criterion specifications, 
as with all research, the means of 
obtaining answers or partial answers becomes 
a problem of strategy. That is, what is the 
place of and need for different levels of outcome 
research? What can the varied approaches 
such as case studies, simplifications, 
and different levels of controlled experiment 
contribute? In view of the range and 
complexity of the variables involved, con- 
tinuing series of well-controlled, factorially 
designed experiments appear to be not only 
the most efficient means to obtain knowledge 
relevant to the ultimate question (Edwards 
& Cronbach, 1952), but probably the only 
way. Both before and after factorially de- 
signed experiments there is a real value for 
lower levels of investigation and for studies 
designed to answer different questions, espe- 
cially in the determination of mechanisms of 
change. However, these investigations must 
be evaluated on the basis of their possible 
level of product. 

Since outcome studies attempt to deter- 
mine cause-effect relationships, the ultimate 
necessity of factorial investigations is appar- 
ent upon consideration of two principles of 
basic research: (a) There is really only one 
principle of experimental design, and that is 
to “design the experiment so that the effects 
of the independent variables can be evaluated 
unambiguously [Underwood, 1957, p. 86],” 
and (b) in order to do this, “to draw a con- 
clusion about the influence of any given vari- 
able, that variable must have been systemati- 
cally manipulated alone somewhere in the 
design [Underwood, 1957, p. 35].” These 
principles again highlight the need of de- 
scribing, measuring, or controlling each of 
the variables or domains listed in Table 1 to 
prevent confounding. 

The problem of necessary experimental con- 
trols for the prevention of confounding 
(Frank, 1959) is clear in factorial designs. 
Additionally, tactical decisions within a fac- 
torial study need to be made on the basis of 
the strength of knowledge in an area at any 
given time. Since the major points concerning 
needed controls and the practical and empirical 
problems of conducting factorial outcome 
studies have been presented elsewhere 
(Paul, 1966), these points will only be summarized 
here, with focus on strategies which 
are desirable at our current level of knowledge. 

An adequate definition of the client sample 
and the related practical problems of providing 
sufficient time for appraisal, adequate information 
prior to treatment assignment, and 
large enough groups constitute the first con- 
cern of the outcome study. Since change in 
clients’ distressing behaviors (1a) is always a 
dependent variable, selection of the sample 
involves a decision on heterogeneity. In view 
of the likelihood that the severity of distress- 
ing behaviors will vary, even within the same 
class (e.g., obese women), and that resistance 
to change may vary across classes (e.g., obese 
women versus morphine addicts), the tactical 
choice favors selection of clients on the same 
class of target behaviors. 

Even with selection on a homogeneous 
class of target behaviors, there is likely to 
be a wide variation in relatively stable personal-social 
characteristics (1b). Clients might 
be classified on the basis of 1b variables, and 
these classes might then be treated as inde- 
pendent variables (e.g., elderly, lower class 
males of average intelligence versus adoles- 
cent, middle-class females of superior intelli- 
gence). However, practical considerations sug- 
gest that the best strategy for early studies 
would be the selection of homogeneous sam- 
ples described in enough detail for meaning- 
ful comparisons to be made across studies. 
With quantification of major characteristics, 
correlational analyses of client character- 
istics with outcome can provide suggestive 
evidence of possible influencing parameters 
and thus sharpen independent variables for 
subsequent outcome studies. 
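
As a minimal illustration of the correlational step described above, the following Python sketch computes a Pearson correlation between one quantified client characteristic and an outcome score. The data, the variable names (`ages`, `improvement`), and the helper `pearson_r` are hypothetical and are not drawn from any study; the sketch shows only the form such a screening analysis might take.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: a 1b characteristic (age) and improvement in the
# target behavior (pretreatment minus posttreatment distress score).
ages = [20, 25, 30, 35, 40]
improvement = [9, 8, 6, 5, 2]

r = pearson_r(ages, improvement)
print(f"correlation of age with improvement: {r:.2f}")
```

A coefficient obtained this way is suggestive only; in keeping with the strategy described above, it would serve to nominate a characteristic for treatment as an independent variable in a later study.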

While the distressing client behaviors in 
Domain 1a will always be involved as inde- 
pendent and dependent variables, and while 
the relatively stable personal-social character- 
istics in Domain 1b may be independent 
variables, described or controlled, the varia- 
bles in Domain 1c, the physical-social life en- 
vironment of the clients, can seldom be de- 
scribed in detail or treated as independent 
variables. The task then becomes one of control, 
that is, to provide for the eventuality 
that behavioral modifications which may be 
observed are not due to extraexperimental life 
experiences. The only way of controlling for 
these factors appears to be to use a comparable 
no-treatment control group that is observed 
and assessed at the same time and for 
the same amount of time as the treatment 
groups. Such a group also controls for the 
effects of repeated testing and for so-called 
“spontaneous” changes over time. 

Own-control designs (i.e., comparisons of 
3a-3b change with 3b-3f change for the same 
clients) are desirable for obtaining base rates 
on the stability of distressing behaviors, but 
they do not adequately control for changes 
which may be related to the passage of time, 
season, or extraexperimental experiences. 
Therefore, the problem becomes one of divid- 
ing the sample into equivalent experimental 
and control groups. Experimentally, there are 
three ways of obtaining equivalent groups. 
The first method is to match groups, not only 
on target behaviors, but also on all major 
variables believed to be significant from Do- 
mains 1b and 1c, randomizing on other as- 
pects. A second possibility is to equate groups 
by stratified sampling of major categories 
without matching individuals. The third 
method is straight random assignment. For 
any method, randomization of variables not 
matched or measured is important. In fac- 
torial studies, the present strategical choice 
appears to favor, at least, stratification on 
target behaviors and motivation. The danger 
of experimental procedures destroying the 
equivalence of the no-treatment control group 
in relation to other groups may be partially 
circumvented by strategic use of an own-con- 
trol waiting period for a later treatment as 
the no-treatment control for current treatment 
groups. 
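
The second method, stratification with random assignment inside each stratum, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a prescribed procedure: the client records, the two stratifying variables (`severity` and `motivation`), the group labels, and the function `stratified_assignment` are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_assignment(clients, strata_key, groups, seed=0):
    """Randomly assign clients to groups separately within each stratum."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    by_stratum = defaultdict(list)
    for client in clients:
        by_stratum[strata_key(client)].append(client)

    assignment = {}
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)  # randomize everything not stratified on
        # Deal shuffled members round-robin, so group sizes within a
        # stratum differ by at most one.
        for i, client in enumerate(members):
            assignment[client["id"]] = groups[i % len(groups)]
    return assignment


# Hypothetical sample: 12 clients, three in each severity-by-motivation stratum.
levels = [(s, m) for s in ("high", "low") for m in ("high", "low")] * 3
clients = [
    {"id": n, "severity": sev, "motivation": mot}
    for n, (sev, mot) in enumerate(levels)
]
groups = ["treatment", "attention_placebo", "no_treatment"]

assignment = stratified_assignment(
    clients, lambda c: (c["severity"], c["motivation"]), groups
)
```

With three clients per stratum and three groups, each stratum contributes exactly one client to every group, so the groups are equated on the stratifying variables while chance decides which client goes where.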

Constituting the next concern are an 
adequate definition of the therapists and tech- 
niques of treatment and the related prac- 
tical problems of providing sufficient time for 
assessment or training prior to treatment con- 
tact, monitoring of intratherapy procedures, 
and obtaining enough cooperative individuals. 
The usual independent variable of most in- 
terest in the outcome study is the specific 
therapeutic technique (2a) proposed to be 
effective in alleviating behavioral distress. A 
decision on heterogeneity is even more important 
with regard to therapeutic technique, 
since the replication of independent variables 
is involved. In dealing with established treatment 
procedures, this problem becomes even 
more complicated, because most experienced 
psychotherapists have developed their tech- 
niques to a more or less individual art. One 
strategic approach to this problem would be 
to allow complete flexibility among therapists 
with their preferred techniques, determine 
those who are reliably effective, and then re- 
turn to audio or video recordings of the ef- 
fective therapists to determine what they did, 
or conduct more elaborate process studies 
with these therapists as subjects. The diffi- 
culty with this approach is that it provides no 
information on what is ineffective treatment, 
nor does it allow ready comparison to other 
therapeutic techniques. 

On the other hand, if each set of treatment 
techniques is relatively homogeneous within 
treatment groups across therapists, immediate 
knowledge of what constitutes effective treat- 
ment would be available. Thus, whether deal- 
ing with old or new treatment procedures, the 
tactical choice favors homogeneity within 
groups, preferably provided with a single 
monitor-supervisor for all therapists within 
individual treatments. 

By using homogeneous treatment proce- 
dures, the chief problem of control is to dis- 
tinguish between the effects of Domains 2a 
and 2b, that is, the effects of the relatively 
stable personal-social characteristics of the 
therapist versus those of the specific set of 
therapeutic techniques. A related problem is 
that of distinguishing between the specific 
effects of particular therapeutic techniques 
and the nonspecific effects which are involved 
in any therapeutic contact—the “placebo- 
effect” (Rosenthal & Frank, 1958). An ade- 
quate control for placebo effects would then 
be another form of treatment in which clients 
have equal faith, but which would not be ex- 
pected to lead to behavioral change on any 
other grounds. By having each therapist con- 
duct both a treatment to be evaluated and 
an attention-placebo treatment with several 
clients, equating the length and number of 
sessions and all other time factors, not only 
may nonspecific placebo effects be distin- 
guished, but a base rate is provided for the 
improvement resulting from the relatively 
stable personal-social characteristics of the 
individual therapists. As with client charac- 
teristics (1b), therapist characteristics (2b) 
may be treated as additional independent 
variables (e.g., experienced versus inexperienced). 
However, in view of the present state 
of knowledge, the best strategy for early studies 
might limit evaluation of therapist characteristics 
to correlation with outcome to aid 
in sharpening of hypotheses for future evalu- 
ation. 
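
The logic of the two control groups can be made concrete with a short Python sketch. It assumes, purely for illustration, that group effects combine additively; that assumption, the group labels, and the change scores are hypothetical and are not taken from the text or from any data.

```python
from statistics import mean

# Hypothetical change scores (pretreatment minus posttreatment distress)
# for clients in each group; the values are illustrative only.
change = {
    "treatment": [9, 8, 10, 9],
    "attention_placebo": [5, 4, 6, 5],
    "no_treatment": [1, 2, 0, 1],
}
means = {group: mean(scores) for group, scores in change.items()}

# Change seen without any treatment: extraexperimental life experiences,
# repeated testing, and "spontaneous" change over time.
baseline = means["no_treatment"]

# Nonspecific (placebo) effect: what therapeutic contact adds beyond baseline.
nonspecific = means["attention_placebo"] - means["no_treatment"]

# Specific effect: what the technique adds beyond nonspecific contact.
specific = means["treatment"] - means["attention_placebo"]

print(f"baseline={baseline}, nonspecific={nonspecific}, specific={specific}")
```

Without the attention-placebo group, only the sum of the last two quantities could be estimated, and the contribution of the technique could not be separated from that of therapeutic contact in general.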

In practice, the physical-social treatment 
environment (2c) will probably be constant 
for any given investigation. In this case, the 
major requirement is an adequate description 
of the facility and usual operating procedures 
to allow comparisons across studies. Should 
more than one facility be involved, however, 
it is important to control for possible confounding 
by conducting all treatments, including 
the attention-placebo, in all facilities. 
While the separate influence of the treatment 
setting itself appears, within normal limits, 
to be the least important domain, settings 
could be evaluated as an independent variable 
by having each therapist conduct each of the 
different treatments with several clients in 
two or more facilities. 

With regard to the time dimension, each of 
the time periods listed in Table 1 should be 
specified and held constant, unless time is to 
be treated as an independent variable. As- 
sessments should be taken at the same points 
in time and the number and duration of sessions 
held constant across groups. If time is 
to be treated as an independent variable, for 
example, in questioning whether spacing or 
duration of sessions were important, it would 
also be necessary to systematically vary these 
aspects across relevant control groups. 


If the investment of time and money is 
made for all of the above therapists and con- 
trols, it would also appear to be good strategy 
to evaluate two or more treatments within the 
same design. This could be accomplished by 
extending therapists across more than one 
treatment, in addition to the attention-placebo 
treatment, or by introducing different therapists 
into the design to conduct both a different 
type of treatment and the attention-placebo 
treatment. In either case, main effects 
may be evaluated for treatments and for 
therapists, as well as interactions. It would 
also seem reasonable to choose the most prom- 
ising competing treatments for any particular 
disorder for comparative evaluation. Further, 
if these treatments were derived from com- 
peting theoretical formulations, an additional 
contribution could be made to basic science, 
as well as to the ethical-technological aspects. 
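
A small numerical sketch may clarify what is meant by main effects and interactions in a crossed therapist-by-treatment design. The Python below is an assumption-laden illustration: the therapist and treatment labels and all scores are hypothetical, the design is taken to be balanced, and only descriptive means are computed (a significance test would require an analysis of variance, which is not shown).

```python
from statistics import mean

# Hypothetical improvement scores; every therapist conducts every
# treatment with several clients (a fully crossed, balanced design).
scores = {
    ("therapist_A", "treatment_X"): [8, 10, 9],
    ("therapist_A", "attention_placebo"): [4, 5, 3],
    ("therapist_B", "treatment_X"): [6, 7, 8],
    ("therapist_B", "attention_placebo"): [4, 3, 5],
}

cell_means = {cell: mean(vals) for cell, vals in scores.items()}
therapists = sorted({t for t, _ in scores})
treatments = sorted({x for _, x in scores})

# Main effects: marginal means, averaging cell means over the other factor.
treatment_means = {
    x: mean(cell_means[(t, x)] for t in therapists) for x in treatments
}
therapist_means = {
    t: mean(cell_means[(t, x)] for x in treatments) for t in therapists
}

# Interaction contrast for the 2 x 2 case: does the advantage of the
# treatment over the attention-placebo depend on who delivers it?
advantage = {
    t: cell_means[(t, "treatment_X")] - cell_means[(t, "attention_placebo")]
    for t in therapists
}
interaction = advantage["therapist_A"] - advantage["therapist_B"]

print(treatment_means, therapist_means, interaction)
```

Because each therapist appears under every treatment, differences among treatment means are not confounded with differences among therapists, and a nonzero interaction contrast indicates that the two factors do not combine additively.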

TABLE 2 

SUMMARY OF MAJOR APPROACHES TO OUTCOME RESEARCH 

Approach: Case study (without measurement) 
   Confounding possible: Within & between all domains 
   Level of product: Crude hypotheses. 

Approach: Case study (with measurement) 
   Confounding possible: Within & between all domains 
   Level of product: Correlational conclusions. Strengthened hypotheses. 

Approach: Nonfactorial group design (without no-treatment control) (a) 
   Confounding possible: Within & between all domains 
   Level of product: Same as above. (a) Hypotheses strengthened as 
   individual studies move across domains. 

Approach: Nonfactorial group design (with no-treatment control) 
   Confounding possible: Within client (1a, b); within treatment (2a, b, c) 
   Level of product: Antecedent-consequent relationship established 
   between classes. Determinants strengthened as individual studies 
   move across domains. 

Approach: Factorial group design (with no-treatment control & 
   attention-placebo control) 
   Confounding possible: None necessary 
   Level of product: Antecedent-consequent relationship established for 
   specifics within and between classes. Analytic conclusions for 
   complex variables. 

Approach: Laboratory simplification 
   Confounding possible: None necessary 
   Level of product: Antecedent-consequent relationship established for 
   specifics within & between classes for analogues. Analytic 
   conclusions for specific variables. 

(a) Lower possible confounding, higher level product for “A-B-A own-control” approach (see text). 

Since factorial studies of the above type do 
involve a tremendous investment of time, 
money, and personnel, it is important to consider 
the place and value for lower levels of 
investigation and for studies designed to 
answer different questions both before and 
after factorially designed experiments. A 
summary of the major approaches relating to 
outcome research, along with a designation of 
the confounding possible and resulting level 
of knowledge obtainable for each is presented 
in Table 2. As indicated above, the factorially 
designed experiment with no-treatment and 
attention-placebo controls is the only ap- 
proach which offers the establishment of an- 
tecedent-consequent relationships for specific 
treatments without possible confounding. With 
this type of design, analytic conclusions may 
be reached for complex variables, such as a 
total treatment system. 

Once the effectiveness of a complex treat- 
ment is established across specified problems, 
populations, and therapists, a number of these 
alternative approaches become valuable re- 
search strategies. One of the most valuable 
approaches is that of “simplification” (Bor- 
din, 1965), that is, abstracting from the origi- 
nally observed phenomena in the clinical set- 
ting and transferring them to the laboratory, 
where greater precision can be obtained in ex- 
perimental isolation, manipulation, and con- 
trol. The term experimental analogue is often 
given to this approach, since in the process of 
simplification, one or more of the variables 
or domains relevant to psychotherapy is 
deemed to be analogous to those existent in 
the natural clinical setting (Maher, 1966). 
To the extent that the analogue shares the 
essential characteristics of clinical procedures 
and phenomena, this approach is a powerful 
and economical means for determining the 
mechanisms of operation and specific parame- 
ters of influence for complex variables identi- 
fied in the factorial group design. However, 
any changes in procedure or hypotheses developed 
for other phenomena which grow 
from experimental analogues need to be con- 
firmed in the clinical setting. 

While both laboratory simplifications and 
factorial group designs in the clinical setting 
allow analytical conclusions to be reached, 
lower levels of investigation also have strategic 
value for the ultimate question of outcome. 
Nonfactorial group designs with no-treatment 
controls can establish antecedent-consequent 
relationships between treatment 
and outcome in the same manner as factorial 
designs. However, since the nonfactorial de- 
sign cannot separate within-class confound- 
ing, its utility must be considered in relation- 
ship to the available knowledge concerning 
the effectiveness and applicability of treat- 
ment techniques. Following a factorial study, 
the nonfactorial design may be valuable in 
extending treatment evaluation across do- 
mains, that is, to different types of clients, 
problems, therapists, treatment settings, and 
variations in the time domain. The limiting 
factor with this usage is that, for the accumu- 
lation of knowledge to remain precise, new 
variation can be introduced into only one do- 
main at a time. The second strategic use of 
the nonfactorial design with a no-treatment 
control is to provide, prior to the factorial 
experiment, global validation of promising 
treatment procedures and of new combina- 
tions of known methods. Since confounding 
is possible within classes of variables, from a 
scientific point of view, the latter usage serves 
only a mapping function. Practically, how- 
ever, this mapping function has considerable 
value; only the promising therapists or treat- 
ment procedures need be included in later 
factorial outcome studies, and only effective 
treatment procedures need be continued in 
clinical practice. Additionally, therapists or 
techniques which cause clients to get worse 
can be immediately identified and redirected. 

The three lower approaches in Table 2 (the 
nonfactorial design without no-treatment controls
and case studies with or without measurement)
cannot provide evidence of antecedent-consequent
relationships because confounding is possible
between domains and classes as well as within.
The individual case
study without external measurement is of use 
only in the earliest phase of the clinical de- 
velopment of techniques. The case study with 
measurement before, during, and after treat- 
ment constitutes the first step in validation 
by establishing correlational conclusions and 
communicating procedures to others. The hypotheses
of a promising treatment procedure
and its parameters can be strengthened as
case studies accumulate across domains and
through the uncontrolled nonfactorial design;
however, these approaches can at best serve
an early, crude mapping function and can 
never validate a specific technique. One par- 
ticular approach involving a nonfactorial 
group design without no-treatment controls 
does not quite fit with these statements. This 
is the “A-B-A own-control” design in which 
the client’s distressing behavior is reduced, 
increased, and again reduced contingent upon 
therapeutic techniques (Ullmann & Krasner, 
1965). By demonstrating these temporal rela- 
tionships reliably across groups, the likeli- 
hood of between-class confounding by spon- 
taneous fluctuation in time, or extraexperi- 
mental life experiences is quite low. The level 
of product for this design approaches that of 
the nonfactorial group design with no-treat- 
ment controls. 

One other approach to outcome research 
which appears from time to time is the “retro- 
spective study” in which clinical records are 
searched to obtain “measurement” or descrip- 
tion on all variables, including the outcome 
measure itself. Although the rationale of such 
studies may be the same as prospective ex- 
periments, the crudeness of the data and 
methodological problems involved are so 
nearly insurmountable that these studies ap- 
pear to contribute little more than confusion. 
With careful application of appropriate meth- 
odology and strategy, hope exists that 25 
more years of research will no longer find 
psychotherapy characterized as “an undefined 
technique applied to unspecified problems 
with unpredictable outcome [Raimy, 1950, p.

Each of 4 therapists saw each of 4 patients in counterbalanced order for 1
psychotherapeutic interview lasting approximately 20 minutes. Each of 5 vari- 
ables (total activity, percentage of feeling words, percentage of action words, 
number of questions, and number of reinforcements) was scored separately 
for each patient and each therapist for each of 4 sections of each interview. 
The results indicated that both therapist and patient behaviors were determined
by the therapist, the patient, and the particular Patient X Therapist
interaction and that therapists showed more variation in their behavior
with different patients than patients showed in their behavior with different
therapists.


Research emphasis in psychotherapy has 
gradually changed over the years. The earlier, 
almost exclusive emphasis on patient charac- 
teristics has recently been balanced by an 
increased attention to the therapist’s con- 
tribution to the psychotherapeutic process 
(e.g., Goldstein, 1962; Strupp, 1962). Such 
studies have indicated that although it may 
be profitable to study patient and therapist 
characteristics by themselves, it is even more 
important to investigate how these two facets 
interact (Luborsky & Strupp, 1962). 

Such “interaction studies,” although as- 
suming interaction, have, for the most part, 
ignored therapist changes and have focused 
almost exclusively on how patient and thera- 
pist characteristics interact to produce a 
change in the patient or in some variable 
presumed to be primarily related to the patient.
Thus such dependent variables as
number of therapy visits, “therapeutic suc- 
cess,” patient attitudes, and level of patient 
conditioning—all related primarily to patient 
behavior—have been used to assess the effect 
of patient-therapist similarity on the MMPI 
(Carson & Heine, 1962) and on the Myers- 
Briggs Type Indicator (Mendelsohn & Geller, 
1965), whether a therapist with a certain 
Strong vocational inventory profile treats a
psychotic or neurotic patient (McNair, Callahan,
& Lorr, 1962), and the “compatibility”
of therapist and patient “needs” (Sapolsky,
1965).

¹ The manuscript is based, in part, on a paper
presented at the Sixth International Congress of
Psychotherapy in London, England in August 1964.

Despite this tendency to focus on patient 
behavior, there have been several investi- 
gators who have used therapy analogues to 
study how Patient X Therapist interactions 
affect the therapist. Kemp (1964) found that 
undergraduate subjects classified as type “A” 
or “B” on the basis of their Strong voca- 
tional inventory responses differed in their 
ability to choose “interventions” and in their 
feelings of discomfort as a function of the 
type of patient they listened to. In a related 
experiment, Carson, Harden, and Shows 
(1964) reported that the ability of a type 
“A” or type “B” interviewer to obtain in- 
formation depended upon the type of inter- 
viewee. Even these therapy analogue studies, 
however, did not simultaneously measure the 
behavior of both the “therapist” and the 
“patient.” This type of analysis would appear 
to be the next logical step in “interaction 
studies.”

Such a step would require viewing the 
psychotherapy situation as a system. A sys- 
tems orientation implies true interdependence 
between the units of the system so that 
changes in one part are reflected in changes 
in the state of the other parts (Miller, 1965). 
This orientation implies not only that thera- 
pists can influence and change patients, but 
that, conversely, patients can influence and 
change therapists. Most psychotherapists 
would probably agree with this in principle.
However, a more informative and crucial
problem is the extent to which specific aspects 
of the therapeutic process are “therapist 
determined,” “patient determined,” or are an 
interaction between the two units of the sys- 
tem. Lennard and Bernstein (1960) found a 
systems framework useful in the analysis of 
Patient X Therapist interaction, and they also 
found that patients could influence therapists. 
They were not able, however, because of their 
experimental design, to specify the relative 
influence of therapist and patient. Such a 
question can only be answered by a design in 
which each of several therapists sees each of 
several patients for one or more psychotherapy
sessions.

This type of design is not feasible in most 
therapeutic situations. However, several set- 
tings in which such a design is possible have 
been established. One such setting is a clinic 
at the Palo Alto-Stanford Hospital, in which 
this study was conducted. A somewhat similar 
investigation was conducted by Van der Veen 
(1965), whose report was published after the 
data for this study had been collected. 

Van der Veen analyzed, according to a 
client-centered orientation, tape recordings of 
two interviews from each of three patients, 
each of whom had been seen by five thera- 
pists. The patients were part of a ward of 
chronic hospitalized schizophrenics who were 
free to visit or not to visit, as they wished, 
with therapists. The interviews were rated on 
two patient variables (problem expression 
and immediacy of experiencing) and two 
therapist variables (congruence and accurate 
empathy). The general findings were that the 
interview behavior of the patient was a 
function of the patient, the therapist, and 
the particular Therapist X Patient interaction; 
and that the interview behavior of the thera- 
pist was a function of the therapist and the 
patient. It was concluded that both the 
patients and the therapists significantly 
influenced each other’s therapeutic behavior. 

Van der Veen pointed out three important
limitations of his study: (a) the limited
generality of the results due to the lack of
random patient-therapist selection; (b) the
high level of inference and the only moderate
reliability achieved in rating the variables;
and (c) the possibility, since the raters heard
both the therapist and the patient on the tape
segments, that the rating of the patient was
influenced by the level of therapist behavior
and the rating of the therapist by the level of
patient behavior. There is, then, a need to
repeat a similar factorial design measuring
different dependent variables in a different
population of patients and therapists.

The present study is both a partial replica- 
tion and an extension of Van der Veen’s, in 
that the patients were selected from an out- 
patient instead of an inpatient population, 
the therapists were all second-year psychi- 
atric residents who had received similar pro- 
fessional training, the variables were each 
analyzed separately for both therapists and 
patients, and very little inference was needed 
in scoring the variables. This last feature 
deserves to be emphasized. It is especially 
important in factorial designs using clinic pa- 
tients to incorporate significant measures 
which still require only a relatively low 
amount of inference. Since it is difficult to 
acquire a large sample using such an ex- 
perimental design, there is a particularly 
strong need for replication from other sources 
using different parameters but the same vari- 
ables. It is much more difficult and some- 
times almost impossible to use variables from 
other studies when the raters have to be 
trained extensively and the ratings are highly 
subjective. 

The methodological model used in the pres- 
ent study is similar to that of Van der Veen, 
in that it considers three possible sources 
of determinants for the behavior of both the 
patient and the therapist. The three sources 
which need to be taken into account in
understanding either the patient’s or the 
therapist’s behavior are the patient, the thera- 
pist, and the interaction related to the 
particular patient-therapist combination. 

The utilization of an experimental design
in which several individuals are each assessed
in each of several interpersonal settings can
also test a different, important theoretical
issue: the degree to which behavior is consistent
across situations. How consistent is a
therapist’s behavior across psychotherapeutic
interviews with several different patients?
How consistent is a patient’s behavior across
psychotherapeutic interviews with several 
different therapists? The problem of the rela- 
tive consistency of individual behavior across 
settings has been an important one in the 
study of nearly every personality trait since 
Hartshorne and May’s (1928) studies of hon- 
esty. Their conclusions that honest behavior 
was quite specific to each setting and that 
one could not generalize about a subject’s 
honesty from a few samples of his behavior 
are well known, as are Allport’s (1937) theo- 
retical reinterpretations of their data and 
Burton’s (1963) statistical reinterpretations 
of them. There have been, however, relatively 
few further empirical attempts to study 
systematically the consistency in individual 
behavior across several settings. 

It is possible that therapists might be rela- 
tively consistent in emitting therapeutic be- 
haviors across different patients. It is also 
possible, however, that different patients 
might elicit different therapeutic behaviors in 
the same therapist. Similar considerations 
apply to the consistency of patient behav- 
iors. One of the major purposes of the study, 
then, was to provide some empirical evidence 
on the relative consistency of patient and 
therapist behaviors across different interper- 
sonal settings. 


METHOD 
Design 


The study was carried out in a brief contact 
clinic which was already arranged so that each patient
in the clinic would see a different clinic therapist
each week. The four female patients utilized
constituted a selected sample of admissions to the 
clinic during the 2-month period preceding the 
actual beginning of the study.

Patients were admitted to the clinic on the basis
of the usual clinic criteria, which included: (a) Patients
who had difficulty establishing a good, relatively
prolonged relationship with one therapist
and who might still be helped by a prolonged contact
with a helpful agency, even though, or possibly
because, they would be seeing a different therapist
each week. The concept of “institution transference”
was important in this connection. (b) Patients who
had a history either of many hospitalizations or of
many previous unsuccessful attempts at psychotherapy.
(c) Patients with multiple medical symptomatology,
for example, hypochondriasis or psychophysiological
reactions, who had adequate provisions
for necessary conjoint medical treatment. (d) Patients
who had particular difficulty in handling the
usual insight-oriented “talking cure” and who were
likely to become easily frustrated or disappointed
with one therapist.

TABLE 1
DESIGN

                    Patient (P)
Therapist (T)    PA    PB    PC    PD
TW                1     2     4     3
TX                4     1     3     2
TY                2     3     1     4
TZ                3     4     2     1

The four male therapists utilized constituted a 
random selection of the psychotherapists functioning 
in the clinic. The therapists were all psychiatric
residents who were in the middle of their second 
year of residency training. 

Each of the four therapists saw each of the four 
patients in counterbalanced order for one psychotherapeutic
interview lasting approximately 20 minutes.
The interviews were spaced 1 week apart.
Table 1 shows the design of the study. 

Each of the 16 interviews was tape-recorded and 
transcribed, and each was divided into four equal 
time sections. Thus, this was a 4 X 4 factorial de- 
sign with measures derived from each of the four 
different sections of each interview. Each section 
of each interview was scored on the patient and 
therapist variables. 
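
The counterbalancing in Table 1 can be checked mechanically. The sketch below is illustrative only: it assumes that each cell of Table 1 gives the week (1 to 4) in which that therapist saw that patient, and the names `DESIGN`, `is_counterbalanced`, and `schedule` are hypothetical, not taken from the study.

```python
# Interview order copied from Table 1: rows are therapists, columns are
# patients. Reading each cell as the week of the interview is an assumption.
DESIGN = {
    "TW": {"PA": 1, "PB": 2, "PC": 4, "PD": 3},
    "TX": {"PA": 4, "PB": 1, "PC": 3, "PD": 2},
    "TY": {"PA": 2, "PB": 3, "PC": 1, "PD": 4},
    "TZ": {"PA": 3, "PB": 4, "PC": 2, "PD": 1},
}


def is_counterbalanced(design):
    """True if every order position occurs once per therapist and once per patient."""
    therapists = list(design)
    patients = list(design[therapists[0]])
    expected = set(range(1, len(patients) + 1))
    rows_ok = all(set(design[t].values()) == expected for t in therapists)
    cols_ok = all({design[t][p] for t in therapists} == expected for p in patients)
    return rows_ok and cols_ok


def schedule(design):
    """Group (therapist, patient) pairs by the week in which they met."""
    weeks = {}
    for therapist, row in design.items():
        for patient, week in row.items():
            weeks.setdefault(week, []).append((therapist, patient))
    return weeks


if __name__ == "__main__":
    print(is_counterbalanced(DESIGN))
    for week, pairs in sorted(schedule(DESIGN).items()):
        print(week, pairs)
```

Under this reading the design is a 4 X 4 Latin square: every therapist and every patient takes part in exactly one interview per week.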


Patient and Therapist Variables 


Easily quantifiable and reliably scorable patient 
and therapist variables were utilized in the statistical 
analysis of the Patient X Therapist interactions. The
same variables were used for both patients and 
therapists. The five variables used were as follows: 
(a) Total activity was measured by counting the 
total number of words spoken. (b) Percentage of
feeling words was measured by counting the total 
number of feeling words and dividing this total by 
the total number of words spoken. Feeling words 
were defined simply by enumerating examples of 
words directly relevant to an individual’s affective 
state, for example, happy, afraid, tired, scared, angry, 
depressed, lonely, and bored. (c) Percentage of 
action words was measured by counting the total 
number of words referring to actions and 
dividing this total by the total number of 
words spoken. Action words were also defined 
simply by enumerating examples of words di- 
rectly relevant to possible overt behaviors in which 
an individual might be involved, for example, 
making, doing, reprimanding, talking, swimming, and 
drinking. (d) Number of questions asked was meas- 
ured by counting the total number of questions 
asked by each of the participants in the interaction. 
(e) Number of reinforcements was measured by 
counting the total number of “Mm-hmm’s” emitted.

TABLE 2

SUMMARY OF ANALYSES OF VARIANCE OF THERAPIST VARIABLES

                              Number of          Percentage of      Percentage of      Number of          Number of
                              words              feeling            action             questions          reinforcements
Source                   df   MS        F        MS        F        MS        F        MS        F        MS        F
Between patients          3   1400.53   6.73**   12.27     4.13*    23.83     3.16*    4.26      9.91**   .73       1.09
Between therapists        3   1184.50   5.69**   3.70      1.24     13.70     1.82     .23       -        34.37     51.30**
Between sections          3   7471.77   3.59*    3.93      1.32     13.53     1.79     2.33      5.42**   .30       -
Patients X Therapists     9   249.79    1.20     11.46     3.85**   18.23     2.42*    .10       -        1.52      2.20*
Patients X Sections       9   302.88    1.46     2.60      -        8.01      1.06     .69       1.60     .37       -
Therapists X Sections     9   150.09    -        2.28      -        8.54      1.13     .69       1.60     [illegible] -
Residual                 27   208.06             2.98               7.54               .43                .67
Total                    63
* p < .05.
** p < .01.

Each variable was scored separately for each pa- 
tient and each therapist for each of the four time 
sections of each interview. Thus, both patient and
therapist scores could be compared for each variable. 
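
The scoring of the five variables for one speaker in one interview section can be sketched as follows. This is a minimal illustration, not the authors' procedure: the word lists contain only the examples enumerated in the text, counting a question by its question mark and treating “Mm-hmm” as a single word are assumptions, and `score_section` is a hypothetical name.

```python
import re

# Only the example words enumerated in the text; the full lists used by the
# judges are not given, so these sets are assumptions for illustration.
FEELING_WORDS = {"happy", "afraid", "tired", "scared", "angry",
                 "depressed", "lonely", "bored"}
ACTION_WORDS = {"making", "doing", "reprimanding", "talking",
                "swimming", "drinking"}

WORD_PATTERN = re.compile(r"[a-z']+(?:-[a-z']+)*")
REINFORCEMENT_PATTERN = re.compile(r"mm-?hmm")


def score_section(text):
    """Score one speaker's speech in one section on the five variables."""
    words = WORD_PATTERN.findall(text.lower())
    total = len(words)
    feeling = sum(1 for w in words if w in FEELING_WORDS)
    action = sum(1 for w in words if w in ACTION_WORDS)
    return {
        "total_activity": total,
        # Percentages are counts divided by total words, times 100.
        "pct_feeling": 100.0 * feeling / total if total else 0.0,
        "pct_action": 100.0 * action / total if total else 0.0,
        # Assumption: each question in the typescript ends with "?".
        "questions": text.count("?"),
        "reinforcements": sum(1 for w in words
                              if REINFORCEMENT_PATTERN.fullmatch(w)),
    }


if __name__ == "__main__":
    sample = "I was afraid and angry. Were you talking to him? Mm-hmm."
    print(score_section(sample))
```

Calling `score_section` once per speaker per section would give the 4 X 4 X 4 array of scores that each analysis of variance requires.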

Reliabilities for scoring feeling and action words 
were obtained by having two judges independently 
score two selected typescripts for each of these two 
categories of words. The judges achieved satisfactory
reliability, correlating .74 and .73 with each other 
on the feeling words and .75 and .84 on the action 
words. Since the other three variables (total activ- 
ity, number of questions, and number of reinforce- 
ments) were scored by merely counting their 
occurrence in each typescript, no further reliability 
checks were needed. 


Hypotheses 


The general hypothesis was that the therapist and 
the patient each influence each other’s behavior in 
therapy. More specifically, it was hypothesized that 
both the behavior of the therapist and the behavior 
of the patient would be a function of the therapist, 
of the patient, and of the particular patient-therapist 
combination. It was further hypothesized that, for 
therapist behaviors, the therapist effect would be 
greater than the patient effect, whereas, for patient 
behaviors, the patient effect would be greater than 
the therapist effect. No differential predictions were 
made for the five different variables, and no specific 
hypotheses were made relating to the intercorrelations 
of patient and therapist behaviors. 


RESULTS 
Therapist Analyses 


A summary of the results of the analyses
of variance³ on the therapist measures of
the five variables is presented in Table 2.

In general, the results indicate that therapist
behavior is determined, to an important
extent, by the particular therapist, by the
particular patient, and by the particular
Patient X Therapist interaction. Two of the
five variables show statistically significant
between-therapist effects, four of the five
show significant between-patient effects, and
three show significant Therapist X Patient
interaction effects.⁴ Two of the variables show
significant between-sections-of-the-interview
effects. The number of words the therapist
speaks consistently rises during the last
quarter of the interview, whereas the number
of questions the therapist asks is consistently
higher in the first half than the last half of
the interview. None of the therapist analyses
show significant Patient X Section or Therapist
X Section interaction effects.

³ The analyses of variance employed a model
which assumed that the three main effects were
fixed rather than random variables. In this sense,
the different patients are considered to be experimental
treatments for the therapists and the therapists
to be experimental treatments for the patients.
The error term for all the F ratios is thus the
within sum of squares. In this particular type of
design, generalization beyond the particular samples
utilized is quite limited. The generality of the results
will have to be derived from their reproducibility
with other samples.

⁴ The Patient X Therapist interaction effects are
confounded with the order effect, since each patient
saw each therapist in a different order. Part of the
effect attributable to order, that is, that part related
to whether the patient was being seen for the first,
second, third, or fourth interview, may be partialled
out by calculating a sum of squares between all
first, second, third, and fourth interviews and
dividing by the 3 degrees of freedom for this effect.
When this effect is partialled out, the Patient X
Therapist interaction for both therapist and patient
percentages of feeling words and number of reinforcements
remains statistically significant, whereas
the interaction for both therapist and patient percentages
of action words does not reach significance.
These latter results constitute the best approximation
of the magnitude of the Patient X Therapist interaction
which can be made from these data.
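
Because every effect is tested against the residual mean square, each F ratio in Table 2 is simply the effect's mean square divided by the residual mean square. The sketch below reproduces three F ratios from the number-of-questions column. The mean squares are taken from Table 2; the critical values are the conventional tabled values of F for 3 and 27 and for 9 and 27 degrees of freedom, which are an assumption added here, and the function names are hypothetical.

```python
# Mean squares for therapist "number of questions", from Table 2.
RESIDUAL_MS = 0.43
EFFECTS = {
    # source: (degrees of freedom, mean square)
    "Between patients": (3, 4.26),
    "Between sections": (3, 2.33),
    "Patients X Sections": (9, 0.69),
}

# Assumption: conventional tabled critical values of F with 27 residual
# degrees of freedom, keyed by effect df as (.05 level, .01 level).
CRITICAL_F = {3: (2.96, 4.60), 9: (2.25, 3.15)}


def f_ratio(effect_ms, residual_ms):
    """F ratio for an effect tested against the residual mean square."""
    return round(effect_ms / residual_ms, 2)


def significance(f_value, effect_df):
    """Return '**', '*', or '' using the same star convention as the tables."""
    f05, f01 = CRITICAL_F[effect_df]
    if f_value >= f01:
        return "**"
    if f_value >= f05:
        return "*"
    return ""


if __name__ == "__main__":
    for source, (df, ms) in EFFECTS.items():
        f_value = f_ratio(ms, RESIDUAL_MS)
        print(f"{source:<22} F({df}, 27) = {f_value:.2f}{significance(f_value, df)}")
```

The same division applies to every cell of Tables 2 and 3, so it doubles as a quick consistency check on the tabled values.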

Figure 1, which illustrates the results for 
the average number of therapist reinforce- 
ments (“Mm-hmm’s”) per minute, shows the 
consistent therapist individual differences 
which exist for this variable. Therapists W and 
Y emitted a higher number of reinforcements 
per minute than did Therapists X and Z, 
regardless of the particular patient with
whom they were interacting. The fact that 
Patient B elicited the highest number of 
reinforcements from Therapist W, whereas 
she elicited the lowest number from Thera- 
pist Y illustrates the type of result which 
contributes to the significant Therapist X Pa- 
tient interaction effect. 

Figure 2 shows the results for the average 
percentage of therapist feeling words per 
minute. The significant between-patients effect
is illustrated by the fact that Patient B
tended to elicit a larger total percentage of
feeling words than did Patient D, and the 
significant interaction effect is illustrated by 
the fact that Patient A elicited the lowest 
percentage of feeling words from Therapist W 
and the highest percentage from Therapist X, 
whereas Patient B elicited the highest per- 
centage from Therapist W and the lowest 
percentage from Therapist Y. 


Patient Analyses 


A summary of the results of the analyses of
variance on the patient measures of the five
variables is presented in Table 3. All five of
the patient variables show statistically significant
patient effects, three of the variables
show significant Patient X Therapist interaction
effects, and only one of the variables
shows a significant between-therapist effect.
Two of the variables show significant
between-sections-of-the-interview effects. The number
of words the patient spoke tended to be lower
during the last quarter of the interview,
whereas the percentage of patient action
words tended to be higher during the first
half than the second half of the interview.
Patients’ total words and percentage of action
words also both show significant Patient X
Section-of-the-Interview interaction effects.

TABLE 3

SUMMARY OF ANALYSIS OF VARIANCE OF PATIENT VARIABLES

                              Number of          Percentage of      Percentage of      Number of          Number of
                              words              feeling            action             questions          reinforcements
Source                   df   MS        F        MS        F        MS        F        MS        F        MS        F
Between patients          3   1706.23   11.17**  4.97      6.54**   21.10     26.00**  .23       5.75**   4.97      19.12**
Between therapists        3   95.37     -        1.47      1.93     1.87      2.31     .09       2.25     5.00      19.23**
Between sections          3   1740.50   11.40**  1.47      1.93     10.10     12.47**  .07       1.75     .53       2.04
Patients X Therapists     9   295.29    1.93     1.86      2.45*    4.00      4.94**   .02       -        .97       3.73**
Patients X Sections       9   397.00    2.60*    .64       -        6.44      7.95**   .04       1.00     .34       1.31
Therapists X Sections     9   284.14    1.86     .00       -        1.81      2.23     .03       -        .05       -
Residual                 27   152.70             .76                .81                .04                .26
Total                    63
* p < .05.
** p < .01.

Figure 3 shows the results for the average
number of patient reinforcements per minute.
The significant between-therapists effect is
illustrated by the result that Therapist Y
elicited fewer reinforcements per minute from
all four of the patients than did either Therapist
W or X, whereas the significant between-patients
effect is illustrated by the result that
Patient B emitted a greater number of reinforcements
than Patient A, regardless of which
therapist was involved in the interactions.
The significant Patient X Therapist interaction
effect is illustrated by the result that
Therapist X elicited the greatest number of
reinforcements from Patient C, whereas
Therapist Z elicited the least number of
reinforcements from Patient C.

Fig. 1. Therapist: Average number of reinforcements per minute.
Fig. 2. Therapist: Average percentage of feeling words per minute.
Fig. 3. Patient: Average number of reinforcements per minute.

Figure 4 shows the results for the average
percentage of patient feeling words per minute.
The significant between-patient effect is
illustrated by the fact that Patient B always
emitted a higher percentage of feeling words
than Patient A and the fact that Patient C
always emitted a higher percentage of feeling
words than Patient D. The Patient X Therapist
interaction effect is illustrated by the
fact that Therapists W and Z elicited the
highest percentage of feeling words from
Patient B, whereas Therapist X elicited the
next to lowest percentage from the same
patient.

Fig. 4. Patient: Average percentage of feeling words per minute.

TABLE 4

INTERCORRELATIONS OF PATIENT AND THERAPIST VARIABLES

                                                 Patient variables
                            Number of         Number of     Percentage    Percentage    Number of
Therapist variables         reinforcements    words         of feeling    of action     questions
Number of reinforcements    [illegible]       [illegible]   .14           .07           .05
Number of words             -.17              -.14          .17           -.33          -.14
Percentage of feeling       .52*              .49*          [illegible]   -.67**        .19
Percentage of action        -.52*             [illegible]   -.40          .54*          -.18
Number of questions         .11               .39           .34           -.71**        .08
* p < .05; r = .48.
** p < .01; r = .61.

Table 4 shows the intercorrelations between 
each of the patient and therapist measures 
of the five variables. Most of the correlations 
are low to moderate. However, some are sta- 
tistically significant; for example, there are 
significant positive correlations between the 
percentage of feeling words emitted by the 
therapist and the number of words, the per- 
centage of feeling words, and the number of 
reinforcements emitted by the patient. Also, 
there are significant negative correlations be- 
tween the percentage of action words emitted 
by the patient and both the percentage of 
feeling words and the number of questions 
emitted by the therapist. It is interesting to 
note that therapist reinforcements do not 
correlate significantly with any patient vari- 
ables, whereas patient reinforcements cor- 
relate significantly and positively with thera- 
pist feeling and significantly and negatively 
with therapist action.
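
The entries of Table 4 are product-moment correlations between a therapist measure and a patient measure. A minimal sketch of the computation follows, using the significance thresholds printed under Table 4 (r = .48 for p < .05 and r = .61 for p < .01). The choice of observations to pair (for example, one pair of scores per interview) is an assumption, since the text does not state it, and `pearson_r` and `flag` are hypothetical names.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def flag(r, r05=0.48, r01=0.61):
    """Star a correlation using the thresholds given under Table 4."""
    if abs(r) >= r01:
        return "**"
    if abs(r) >= r05:
        return "*"
    return ""


if __name__ == "__main__":
    # Made-up scores for illustration only; these are not data from the study.
    therapist_feeling = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0]
    patient_action = [6.0, 4.0, 7.5, 3.0, 5.5, 5.0]
    r = pearson_r(therapist_feeling, patient_action)
    print(f"r = {r:.2f}{flag(r)}")
```

Applied to the tabled values, `flag` stars -.67 and -.71 at the .01 level and .52 and .49 at the .05 level, and leaves .39 unstarred.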


DISCUSSION


The results empirically demonstrate the 
necessity of conceptualizing the patient and 
therapist as an interacting system in which 
both individuals mutually influence each 
other. The extent of influence attributable to
each of the three major sources (patient,
therapist, and Patient X Therapist interaction)
varies considerably depending upon
which variable was analyzed and whether that 
variable was measuring a patient or a therapist
behavior. In general, however, all three
sources do contribute substantially to both 
patient and therapist behavior. 

Consistent individual differences among 
therapists were shown for only one variable 
(number of “Mm-hmm’s”), and, for this vari- 
able only, the results provide evidence for a 
“behavioral trait” on which therapists differ 
consistently from each other. For the other 
four behaviors, however, therapists did not 
show any very large consistent individual dif- 
ferences across patients. Perhaps the charac- 
terization of the therapist as a “reinforcing 
machine” (Krasner, 1962) is accurate, at 
least to the extent that therapists are “pro- 
grammed” to emit a certain number of rein- 
forcements per session, regardless of the pa- 
tient they happen to be seeing. However, 
therapists’ reinforcements did not seem to be 
related to an increase in any of the patient 
behaviors measured in this study. The pa- 
tients appeared to be more efficient in elicit- 
ing feeling words from the therapists than 
the therapists were in eliciting feeling words 
from the patients. In this specific sense, the 
patients, whose levels of reinforcement were 
importantly related to the therapists (41.1%
of the variance in patient reinforcement was 
related to consistent differences between the 
eliciting therapists), do not appear to be 


5 The percentages of variance accounted for by 
each source of variance were calculated for the 
fixed-effects analysis-of-variance model utilizing
the rationale and equations given by Gleser, Cron- 
bach, and Rajaratnam (1965), Endler (1966), and 
Endler and Hunt (1966). These authors have 
“reinforcing machines.” The results suggest
that the patient may be an even more sensi- 
tive selective reinforcer than the therapist. 

The results imply that most of these thera- 
pists’ behaviors in therapy are not the result 
of either a “behavior trait” or of a consist- 
ently applied “therapeutic technique,” but, 
rather, are importantly situationally or patient
determined. Take, for example, the variable 
of “talkativeness.” This variable is generally 
considered to be a personality trait, and, to 
the extent that it is conceptualized in this 
fashion, it should show consistent (across 
settings) individual differences, that is, an 
individual who is relatively talkative in one 
setting should also be relatively talkative in 
a variety of other settings. If this were not 
the case, the status of “talkativeness” as a 
personality trait would be open to some 
doubt. When individual differences are specific 
to certain settings, then there is little evi- 
dence for personality traits, which imply 
consistent individual difference across settings. 

Let us assume that “average number of 
words spoken” is a measure of “talkative- 
ness.” What is the evidence then, from these 
data, that these therapists have a consistent 
personality trait of “talkativeness”? Con- 
sistent differences between therapists on this 
variable, even though statistically significant, 
contributed only 15.2% of the total variance, 
whereas consistent differences between pa- 
tients in the amount of “talkativeness” they 
elicited from therapists contributed 18.5% 
of the total variance. These results indicate 
that the interpersonal setting the therapist 
was in (i.e., which patient he was talking
with) contributed as much to the amount he 
talked as did his consistent “trait” of 
“talkativeness.” This suggests that these 
therapists may differ consistently from each 
other in the amount they talk; however, they 
also consistently talk to some patients more 
than they do to others. 


pointed out that the analysis-of-variance technique
can be used to estimate the relative magnitude of
each individual component of variance, expressed as
a percentage of the sum of the different variance
components. The general logic of the technique in-
volves breaking the expected mean squares into their
various variance components and solving separately
for each component.
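
The variance-component logic described in this footnote can be illustrated with a minimal Python sketch. It assumes a balanced Patient X Therapist design with several sessions per pairing and the textbook fixed-effects expected mean squares; the function name, the mean squares, and the design sizes are hypothetical and are not the study's values.

```python
def variance_percentages(ms, n_patients, n_therapists, n_sessions):
    """Percent of summed variance components for each source.

    Assumed expected mean squares (balanced fixed-effects model):
      E(MS_patient)     = error + n_sessions * n_therapists * patient
      E(MS_therapist)   = error + n_sessions * n_patients   * therapist
      E(MS_interaction) = error + n_sessions                * interaction
    Each equation is solved separately for its component.
    """
    err = ms["error"]
    comps = {
        "patient": (ms["patient"] - err) / (n_sessions * n_therapists),
        "therapist": (ms["therapist"] - err) / (n_sessions * n_patients),
        "interaction": (ms["interaction"] - err) / n_sessions,
        "error": err,
    }
    # Negative estimates are set to zero (a common convention, assumed here).
    comps = {k: max(v, 0.0) for k, v in comps.items()}
    total = sum(comps.values())
    return {k: 100.0 * v / total for k, v in comps.items()}


# Hypothetical mean squares, for illustration only.
hypothetical_ms = {"patient": 50.0, "therapist": 26.0,
                   "interaction": 10.0, "error": 2.0}
pct = variance_percentages(hypothetical_ms, n_patients=4,
                           n_therapists=4, n_sessions=2)
```

With these made-up inputs the patient source receives the largest share, which is the kind of comparison the percentages in the text are meant to support.
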
These general conclusions hold to an even
greater extent for the three remaining thera- 
pist variables—percentage of feeling words, 
percentage of action words, and number of 
questions asked. For these therapist variables, 
the patient to whom the therapist is talking 
also contributes a good deal more of the 
variance than does the therapist who is doing 
the talking. To the extent that these vari- 
ables measure something related to therapist 
technique, it appears reasonable to conclude 
that a major determinant of these therapists’ 
techniques may be the particular patient who 
is eliciting that technique. 

There are some important empirical results 
from other studies which bear on this ques- 
tion of therapist consistency. For example,
Rottschafer and Renzaglia (1962) attempted 
to classify counselors into “reflective” and 
“leading” on the basis of initial interviews 
with clients other than the clients utilized in 
their study proper. Using this classification 
of counselor styles they found, for the clients 
actually seen in the study, no relationship be- 
tween counselor style and client-dependency 
statements. Further analysis indicated that 
six of the eight counselors were inconsistent 
in style across interviews with different cli- 
ents. When counselors were reevaluated for 
style on each contact, it was found that there 
was not the dichotomy of leading versus re- 
flective counselors, as previously supposed. 
The counselors used a mixture of two styles, 
apparently depending on the particular client 
they were seeing. The implication is clear that 
attempting to predict patient behavior (e.g., 
number of dependency statements) from a 
general classification of therapist “style” is a 
very risky business, especially since the thera- 
pist may be using a somewhat different style 
with each patient.

Ellsworth (1963) investigated beginning 
counselors’ degree of consistency of feeling- 
verbalization between their counseling inter- 
views and the case conferences in which they 
participated. Each counselor was ranked ac- 
cording to the degree of feeling verbalized 
in the two situations, and Ellsworth found a 
correlation of .38 between the weighted rank- 
ings of these two situations. He concluded 
that the degree of feeling verbalization a 
counselor shows in nonclient relationships is 
significantly consistent with the degree of
feeling verbalization evidenced in his client 
relationships. It should be noted, however, 
that this significant correlation accounts for 
only about 15% of the variance in feeling 
verbalization. Even this relatively small 
percentage is more than the 1.1% of the 
variance in feeling verbalization which 
was accounted for by consistent individual 
differences between therapists in this study. 
Notwithstanding Ellsworth’s conclusions con- 
cerning consistency, it is possible that predic- 
tions concerning amount of feeling verbaliza- 
tion might be more accurately based on the 
setting the counselor happened to be in than 
on the consistent “trait” of the counselor. 
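
The step from a correlation of .38 to "about 15% of the variance" is the usual squaring of the correlation coefficient. A minimal Python sketch of that arithmetic follows; the function name is hypothetical.

```python
def variance_explained(r):
    """Percent of variance shared by two measures correlated at r."""
    return 100.0 * r ** 2


# Ellsworth's reported correlation between the two settings.
share = variance_explained(0.38)   # roughly 14.4, i.e. "about 15%"
unshared = 100.0 - share           # the remainder is not accounted for
```
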

Truax (1966), in an intriguing recent 
study, took excerpts from tape recordings of 
a single, long-term, successful therapy case 
handled by Rogers. The excerpts were ana- 
lyzed to evaluate the adequacy of the client- 
centered view that empathy, warmth, and 
directiveness are offered throughout therapy 
in a manner not contingent on the patient’s 
behavior. Rogers has held that these “condi- 
tions” are primarily attitudinal in nature and 
are offered in a nonselective fashion to the 
patient: They are specifically not contingent 
upon the patient’s verbalizations or behaviors. 
That is, Rogers conceptualizes these variables 
as consistent “traits” of the therapist. Truax 
found that the therapist (Rogers) responded 
in a significantly differential way to five of 
the nine patient behavior classes studied. If 
there is differential responding in empathy 
and warmth to different content with one 
patient, it is but a small step to suggest that 
there is also differential responding in em- 
pathy and warmth to different patients. This 
latter result is, of course, exactly what Van 
der Veen found. In his study, approximately 
45% of the variance in “Accurate Empathy” 
was related to the therapist alone; the other 
55% was related to the patient, to the inter- 
view, to the particular section of the inter- 
view, and to the various first-, second-, and 
third-order interactions. 

In general, then, the therapist modifies and 
adapts his behavior considerably, depending 
upon the characteristics of the particular pa- 
tient he is seeing. Some important directions 
for future research should include investiga- 
tions of the relative degrees of consistency of
beginning and experienced therapists, and of 
the “consistency” of a therapist’s behavior 
across his sessions with one patient in relation 
to his “flexibility” in behavior across sessions 
with different patients. 

The results for patient behavior indicated 
that the patients generally show a greater 
degree of consistent individual behavior 
across settings than do the therapists, or, 
conversely, the patients show less adaptabil- 
ity and change as they move from one rela- 
tionship to another. From these results, it 
appears that therapists modify their behavior 
with different patients to a much greater 
degree than patients modify their behavior 
with different therapists. These results raise
the novel hypothesis that therapists might 
be more open to change than patients during 
psychotherapy. 

If the patient does not modify his behavior 
very much in interacting with different thera- 
pists, then the particular therapist the patient 
is assigned to may not make as much differ- 
ence as has been thought to be the case. In 
this connection, it should be noted that in 
Van der Veen’s study, approximately 35% of 
the variance in “problem expression” and 
“experiencing,” both patient variables, was 
accounted for by individual differences be- 
tween patients, whereas individual differences 
between therapists accounted for only about 
5% of the variance in these variables. 

With respect to the intercorrelations among 
the variables, it is important to note that 
therapist reinforcements were essentially un- 
correlated with all five patient variables, 
whereas patient reinforcements were signifi- 
cantly correlated both with therapists’ per- 
centage of feeling and percentage of action. 
It is possible that the patients were consist- 
ently positively reinforcing therapist feeling 
responses. Evidence that patients attempt to 
modify therapists’ behavior also comes from 
a study by Clemes and D’Andrea (1965), 
who found that therapists experienced more 
difficulty in giving an interview incongruent 
with patient expectations than an interview 
compatible with patient expectations. This 
was so even though therapists were not in- 
formed about patient expectations. The au- 
thors implied that patients were attempting 
to change the therapist’s behavior to con-
form to their own expectations. These results 
suggest the importance of investigating what 
responses patients tend to reinforce in their 
therapists, especially since therapists seem 
more able to vary their behavior across pa- 
tients than patients are to vary their behavior 
across therapists, at least for certain therapy 
behaviors. The mutual influence which occurs 
in the Patient X Therapist interaction is well 
illustrated by the significant positive correla- 
tion between therapist and patient feeling 
responses. It is likely that both the therapist 
and the patient mutually reinforce each 
other’s emphasis on feelings. 

If a mutual influence process exists, then 
the usual verbal-conditioning paradigm is not 
at all analogous to what goes on in psycho- 
therapy. In verbal-conditioning studies an 
experimenter has decided beforehand exactly 
which subject responses he will reinforce ac- 
cording to a specified schedule, that is, the 
experimenter does not allow his behavior to 
be modified by the subject except according 
to prearranged specifications. The influence 
of the “patient” and the “therapist” is seen 
as a one-way process, whereas these results 
suggest that, to a very significant and impor- 
tant degree, the influence process between the 
real patient and the real therapist is a mutual
one and, perhaps in more cases than we 
should like to admit, a situation in which the 
patient has a relatively greater influence on 
the therapist than vice versa. 

The generality of these results across dif- 
ferent samples of patients, therapists, and 
behaviors must be determined by replications 
using factorial experimental designs. The 
weight of the evidence thus far strongly sug- 
gests that the finding that the patient, the 
therapist, and the particular Patient X Thera- 
pist interaction each account for an impor- 
tant amount of behavioral variance will be 
replicated, as indicated, for example, by the 
similarity of Van der Veen’s results, even
though he utilized different samples of
patients, therapists, and variables.

It is probable that behavioral consistencies 
vary importantly depending upon the particu-
lar patient, the particular therapist, and the
particular behavior under consideration. Fur-
ther research directed at delineating these 
relative consistencies should enhance con-
siderably the predictability of various types 
of therapeutic change in specified Patient X 
Therapist interactions. In any case, these 
data strongly indicate that the patient and 
therapist must be studied as a mutual 
influence system. 


REPRESSION-SENSITIZATION AND ITS RELATION
TO MEASURES OF ADJUSTMENT AND CONFLICT


VINCENT J. TEMPONE AND WESLEY LAMB 1


University of Arizona 


The question posed was whether the relationship between repression-sensitiza-
tion (R-S) and adjustment and conflict was linear or curvilinear. An analysis
of R-S protocols of 175 patients at a mental health center and 459 non-
clinical Ss indicated that the clinical Ss were more sensitized. In a 2nd
study, conflict was measured by the Incomplete Sentence Blank (ISB) and
an attitude questionnaire. The R-S and ISB scores of 58 clinical outpatients
were highly correlated (r = .73, p < .01). The R-S scores and conflict scores
obtained from the attitude questionnaire of 95 introductory psychology stu-
dents correlated r = .32, p < .01. The results of both studies tended to support
the hypothesis that the relationship between R-S and adjustment and conflict
is linear.


1 The authors wish to thank both Roland Tharp,
Director, Psychology Department, Southern Arizona
Mental Health Clinic, Tucson, Arizona, and his staff
for their assistance in the collection of the data, and
eee Stropko for his assistance in analyzing the
data.


When first published the Repression-Sensi-
tization (R-S) scale (Byrne, 1961) was con-
ceptualized as a measure of defensiveness,
with individuals scoring at one extreme uti-
lizing defenses of avoidance, denial, and re-
pression, while individuals scoring at the
sensitization end used defenses of approach,
obsession, or intellectualization of anxiety-
arousing stimuli. Since these are two defensive
modes of responding, one would expect an
inverted U-shaped relationship between R-S
and measures of adjustment. Individuals scor-
ing at the extremes on the R-S scale would
be equally maladjusted but utilizing different
defense mechanisms, while those scoring in
the middle of the scale would be adjusted.

However, a number of recent studies would
suggest that the relationship between R-S
and adjustment is linear rather than curvi-
linear. A number of investigators have found
that sensitization is related to the following
variables: the number of deviant response
tendencies on an adjective check list (Byrne,
1961; Lucky & Grigg, 1964); anxiety as
measured by the Taylor Manifest Anxiety
(MA) scale (Joy, 1963); psychological mal-
adjustment as reflected in a number of
MMPI scales (Joy, 1963); deviations from
normality as measured by the California Psy-
chological Inventory (Byrne, Golightly, &
Sheffield, 1965; Joy, 1963); and self-ideal
discrepancy (Byrne, 1961; Byrne, Barry, &
Nelson, 1963). As Byrne (1965) noted, the
difficulty with most of these studies was that
the measures of adjustment were paper-and-
pencil tests, and since by definition repres-
sion is a denial that anything is wrong, then
it is questionable whether these instruments
accurately reflect the emotional adjustment
of repressors. The problem was compounded
further by the fact that frequently there was
a high degree of item overlap between the
R-S scale and these paper-and-pencil meas-
ures of adjustment.

The purpose of this investigation, consist- 
ing of two studies, was to gather additional 
data in an attempt to clarify the relationship 
between R-S and adjustment, using measures 
of adjustment that avoided some of the pit- 
falls of the earlier measures. In the first study, 
adjustment was defined in terms of whether 
or not an individual was seeking outpatient 
services at a mental health clinic. If the 
relationship between R-S and adjustment is a 
curvilinear one, the R-S scores among clinic 
patients should form a bimodal distribution. 
If the relationship between R-S and adjust-
ment is linear, one would expect the clinic
population to have a highly skewed distribu-
tion with most of the scores falling toward
the sensitization end of the scale. Based upon
the earlier studies cited above, a linear rela-
tionship was predicted.

In the second study, two correlational stud-
ies were conducted in which R-S scores were 
correlated with conflict scores. Using a clini- 
cal population, conflict was measured by a 
semiprojective instrument, the Incomplete 
Sentence Blank (ISB). The ISB was selected 
as a measure of conflict, since it was independ- 
ent of the R-S scale in terms of item content. 

In the second correlational study, using a 
college population, the R-S scale was corre- 
lated with conflict as measured by an attitude 
questionnaire. Some recent studies have found
that female sensitizers perceived their moth- 
ers as inconsistent (Byrne, 1964) and both 
male and female sensitizers reported they en- 
joyed behaviors which they believed to be 
morally wrong and likely to have unpleasant 
environmental consequences (Byrne, 1965); 
therefore, it would be expected that sensi- 
tizers are inconsistent, conflicted individuals. 
In filling out an attitude questionnaire, one
would expect sensitizers to be inconsistent 
and to agree with items that were contradic- 
tory. It was predicted that R-S would be 
positively related to conflict as measured by 
the ISB and the attitude questionnaire. 


METHOD 


In the first study, two sets of R-S protocols were 
collected over approximately a 2-year period. The 
first set of 459 protocols was from students in 
lower and upper division psychology classes. The 
second set was obtained from 175 outpatients at a 
mental health clinic. The clinic patients were ad- 
ministered the scale as part of a normal intake 
procedure and were tested in groups of 4 to 10. 

In the second study, the ISB Adult Form, along 
with the R-S scale, was administered to 58 out- 
patients at a mental health clinic. The ISB was 
scored for conflict using the scoring manual (Rotter 
& Rafferty, 1950). It should be noted this scoring 
manual was standardized on a college population, 
but it is common practice to use it to score the 
adult form of the ISB. Only four sentence stems 
differ in the two forms. The stems that do differ
still deal with the same areas of the subject’s life. 
The interjudge reliability coefficient (r=.97) indi- 
cated that the judges had little difficulty applying 
the college norms to the sample used in this study. 

As a second measure of conflict, an attitude ques- 
tionnaire was developed by the second author. The 
questionnaire consisted of logically incongruent 
items (i.e., “I usually accept the blame when some-
thing goes wrong,” vs. “When things go wrong, I 
can usually trace the matter to something someone 
else has done”). Conflict was defined as agreeing with 
two contradictory statements. In this study, conflict 
was similar to McReynolds’ (1958) concept of in-
congruency. According to McReynolds’ theory, anxi- 
ety results when percepts are unassimilated into the 
perceptual system. Unassimilated or incongruent per- 
cepts are those that are difficult to integrate be- 
cause they are inconsistent, contradictory, or dis- 
sonant. 

Four judges (psychology faculty and graduate 
students) were asked to select paired items that were 
logically incongruent. There was 100% agreement on 
35 items, and these made up the attitude question- 
naire. These items, along with buffer items taken 
from the MMPI, made up the scale. The scale was 
called the Biographical Inventory.2 The members of 
each incongruent pair were separated by at least 30 
items. The inventory and the R-S scale were ad- 
ministered to 95 introductory psychology students. 
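
The scoring rule described above (conflict as agreement with both members of a logically incongruent pair) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the item numbers, and the sample answers are hypothetical, and the actual Biographical Inventory items are not reproduced here.

```python
def conflict_score(answers, incongruent_pairs):
    """Count the incongruent pairs in which both items were endorsed.

    answers maps an item number to True ("yes") or False ("no").
    """
    return sum(
        1 for first, second in incongruent_pairs
        if answers.get(first) and answers.get(second)
    )


# Hypothetical item numbers; members of a pair sit at least 30 items apart.
pairs = [(3, 41), (7, 52), (12, 60)]
answers = {3: True, 41: True,    # both endorsed: counts as conflict
           7: True, 52: False,   # only one endorsed: no conflict
           12: False, 60: False}
score = conflict_score(answers, pairs)
```

Buffer items are simply left out of the pair list, so they do not affect the score.
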

The revised R-S scale that was derived from an 
item analysis of the original scale (Byrne et al.,
1963) was used in both studies.


RESULTS 


To test the hypothesis that R-S is related 
to adjustment in a linear fashion, the R-S 
scores of the clinic group were compared with 
those of the college population. A 2 X 2, Sex 
X Population, analysis of variance was per- 
formed. The difference between the two sam- 
ple means was significant (F = 137.15, df= 
1/630, p < .001), with the clinical group 
scoring significantly higher or toward the 
sensitizing end of the scale when compared 
with the college group. 

Table 1 shows the means, medians, and 
standard deviations of the groups according 
to sex. Although both sexes in the clinical 
group scored more toward the sensitization 
end of the scale, females were much more 
sensitized than males. The interaction effect 
of Sex X Population was significant (F= 
11.10, df = 1/630, p < .01). 

Using the college sample as a theoretical 
or expected distribution, 140 clinical outpa- 
tients, or 77%, scored above the median of 
the college population (χ² = 63.0, df = 1,
p < .01).
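
The reported chi-square can be reproduced with a short Python sketch, assuming a one-degree-of-freedom goodness-of-fit test in which half of the 175 clinic patients are expected above the college median and half below. The function name is hypothetical, and the authors' exact computation is not stated in the text.

```python
def chi_square_median_split(n_above, n_total):
    """Goodness-of-fit chi-square against an expected 50/50 median split."""
    expected = n_total / 2.0
    n_below = n_total - n_above
    return ((n_above - expected) ** 2 + (n_below - expected) ** 2) / expected


chi2 = chi_square_median_split(n_above=140, n_total=175)
proportion_above = 140 / 175
```

Under this assumption the statistic matches the reported 63.0. Note that 140 of 175 works out to 80% rather than the 77% printed above; the text does not explain the difference.
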

A significant difference between the means 
of the college and clinical groups is not un-
equivocal evidence that the relationship is not
curvilinear. Since maladjustment as defined in
this study is a dichotomous variable (i.e.,
seeking or not seeking treatment at a mental 
health center), then any relationship between 


2A copy of the Biographical Inventory may be 
obtained by writing to the senior author. 
TABLE 1

MEANS, MEDIANS, AND STANDARD DEVIATIONS OF COLLEGE AND CLINICAL SAMPLES ON R-S SCALE

                            Population
                 College                    Clinical
          Male    Female    Total    Male    Female    Total
N          165      294      459       64      111      175
Mdn      34.17    34.28    35.01    52.66    63.00    61.75
Mean     37.03    36.60    36.75    50.44    62.54    58.11
SD       19.01    19.30    19.18    26.03    22.24    24.34


the means must be linear. However, if one 
examines the two distributions one finds that 
the college sample is positively skewed, or 
toward the sensitization end, while the clini- 
cal population is highly skewed toward the 
repression end. Inspection of Table 1 shows 
that the medians are lower than the means in 
the college sample while the reverse is true in 
the clinical sample. This reflects the skewness 
of the two distributions. Although this is not 
the most direct evidence against the curvi- 
linear hypothesis, it does suggest that in terms 
of frequency there are many more sensitizers 
in the clinical population while the reverse is 
true of the college population. 
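
As a consistency check on Table 1, the total means can be recomputed from the male and female means weighted by group size, and the sign of mean minus median gives the direction of skew discussed above. The Python sketch below uses only the tabled values; the function name is hypothetical.

```python
def pooled_mean(groups):
    """Weighted mean of subgroup means; groups is a list of (n, mean)."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


college_total = pooled_mean([(165, 37.03), (294, 36.60)])
clinical_total = pooled_mean([(64, 50.44), (111, 62.54)])

# Mean minus median: positive in the college sample, negative in the clinic sample.
college_gap = 36.75 - 35.01
clinical_gap = 58.11 - 61.75
```

Both recomputed totals agree with the tabled values of 36.75 and 58.11 to two decimals.
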

Since the clinical group differed from the 
college group on many variables such as age, 
socioeconomic class, etc., it is conceivable
that one of these other variables may be re- 
sponsible for this observed difference. In an 
effort to rule out this possibility, a subset of 
clinical patients who were enrolled as college 
students was selected. Of the 175 clinical pa- 
tients, 30 were presently enrolled as full-time 
college students. This subsample approxi-
mated the college population with respect to 
age and sex distributions. The means of these 
two clinical groups, college (X̄ = 57.60) and
noncollege (X̄ = 58.22), did not differ signifi-
cantly (F = .016, df = 1/173, ns). There-
fore, it seems unlikely that the difference in 
R-S scores between the clinic and college 
groups is the result of some variable other 
than degree of maladjustment.

A more direct approach to the curvilinear 
hypothesis is to examine the relationship be- 
tween R-S and conflict as measured by the 
ISB and the attitude questionnaire. When this
is done it is found that the correlation co-
efficient between R-S and the ISB is r = .73,
p < .01. The correlation ratio for the regres-
sion of conflict scores on R-S is η = .79, p
< .01.

Since any given sample would show devia- 
tion from linearity, it would be expected that 
η would exceed r. The critical question is how
much must η exceed r before one suspects
curvilinearity. To answer this question, the 
test for linearity of regression (McNemar, 
1957) was applied. This test yields an F = 
1.54, df = 8/48, ns. Applying the same tests 
to the relationship between R-S and conflict 
as measured by the attitude questionnaire, 
one finds r = .32, p < .01; and η = .39, p <
.01. Again the test for linearity of regression
is nonsignificant (F = .63, df = 8/85, ns). It 
should be noted that a nonsignificant F in 
the test for linearity of regression does not 
prove linearity. The only conclusion that can 
be drawn is that the departure of the array 
means from a straight line is not great enough 
to warrant the conclusion that the relation- 
ship is curvilinear. 
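
The test for linearity of regression compares the correlation ratio with the correlation coefficient. A minimal Python sketch of the textbook form of this F test follows. It assumes k = 10 arrays (score intervals) in both analyses, a value inferred here from the reported degrees of freedom rather than stated in the text, and it works from the rounded r and η given above, so the results only approximate the reported F values.

```python
def linearity_f(r, eta, n, k):
    """F test for departure from linearity.

    F = [(eta^2 - r^2) / (k - 2)] / [(1 - eta^2) / (n - k)]
    with df = (k - 2, n - k); k is the number of arrays.
    """
    df1 = k - 2
    df2 = n - k
    f = ((eta ** 2 - r ** 2) / df1) / ((1.0 - eta ** 2) / df2)
    return f, df1, df2


# R-S with ISB conflict (58 outpatients); k = 10 is an inferred assumption.
f_isb, df1_isb, df2_isb = linearity_f(r=0.73, eta=0.79, n=58, k=10)

# R-S with attitude-questionnaire conflict (95 students).
f_att, df1_att, df2_att = linearity_f(r=0.32, eta=0.39, n=95, k=10)
```

The degrees of freedom come out as 8/48 and 8/85, matching the text, and the F values fall close to the reported 1.54 and .63; the small gaps are what one would expect from working with two-decimal inputs.
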

Since the conflict score was derived from 
sensitizers responding in the affirmative to
two contradictory statements, it might be 
argued that rather than measuring conflict 
per se the test was measuring acquiescence or 
the tendency merely to respond in the affirma- 
tive (Chapman & Campbell, 1957). As a 
check against this possibility, the total num- 
ber of “yes” responses to the buffer items on 
the Biographical Inventory of individuals 
scoring in the top and bottom quartiles on the 
R-S scale were compared using an analysis of 
variance. The analysis resulted in an F = .98,
df = 1/46, ns. It seems unlikely, then, that
conflict as defined here is merely a tendency 
to give a “yes” response. 


Discussion 


Since by definition repressors tend to deny 
conflict and do not admit to symptoms, it 
may be argued that one could expect little 
else than a positive correlation between the 
R-S scale and the ISB. This argument as- 
sumes that repressors are actually very dis- 
turbed individuals, but the ISB is insensitive 
to this type of disturbance. Another assump- 
tion underlying this argument is that the ISB 
is an insensitive measure of adjustment. Yet 
in four independent studies (Chance, 1958; 
Churchill & Crandall, 1955; Rotter & Raf- 
ferty, 1954; Rotter & Willermann, 1947) each 
with quite diverse populations and different 
measures of adjustment, investigators have 
reported a significant negative correlation be- 
tween the conflict scores on the ISB and ad- 
justment. In addition, as noted earlier, when 
other measures of adjustment are used such 
as the California Psychological Inventory 
(Byrne et al., 1965) repressors appear more 
adjusted than either sensitizers or those scor- 
ing near the median on the scale. 

Another criticism that can be raised is that 
repressors are unlikely to be found in a clinic
population. As already noted, since repres-
sion is a denial or avoidance of emotionally 
laden or anxiety-arousing stimuli, then re- 
pressors are likely to deny their own emo- 
tional states and not seek treatment in a 
clinic. Continuing this line of reasoning, one 
would expect repressors to avoid a clinical 
setting entirely or to seek treatment only in 
the later stages of an emotional disorder 
when the stress becomes so severe that the 
usual repressive defenses fail. If repressors, 
because of the nature of their defenses, are 
likely to avoid a clinical setting, then one 
must turn to nonclinical populations to exam- 
ine repressors. In the populations examined to 
date, mainly college populations, one finds 
that repressors are less incongruent, less devi- 
ant in self-ideal ratings, have better adjusted 
profiles on the MMPI and the CPI, and in 
general appear no different from normal or 
adjusted individuals. In short, then, repress- 
ors are individuals who appear well-adjusted
on personality inventories and are rarely 
found in a psychiatric population. It should 
be noted that this is the usual operational 
definition of a “normal” or well-adjusted in- 
dividual. 

The criticism may still be raised that the 
above criteria of adjustment are not adequate 
when applied to repressors. Persons using re- 
pressive defenses may be so well defended 
that they are likely to appear normal on per- 
sonality inventories and are likely to avoid 
psychiatric treatment. Even if this interpre- 
tation were correct, one would still expect 
these defenses to be only partially effective. 
At some point, stress would become so severe 
that the usual repressive defenses would 
break down and a severe psychological dis- 
turbance would follow. If this were so, one 
would expect that repressors would more 
likely be found among severely disturbed in- 
dividuals. 

Since the clinic used in this study main- 
tained a day hospital, the patients seen cov- 
ered the psychiatric diagnostic spectrum from 
situational anxiety reaction to chronic schizo- 
phrenia. The admitting psychiatric diagnosis 
which included a description of presenting 
symptoms was used to classify patients into 
three broad categories. The categories were 
defined in terms of whether the symptoms 
were the result of (a) present environmental 
stress, (b) a basic neurotic process, or (c) a
psychotic condition. Two judges independ- 
ently sorted the cases into these three cate- 
gories. There was perfect agreement between 
the judges on 82% of the cases. These cate- 
gories were unrelated to R-S. Further, re- 
pressors were found in the same relative pro- 
portion in each of these categories as they 
were in the clinical population as a whole. 
However, this does not answer the question 
entirely, since one can always question the 
adequacy of the psychiatric diagnosis. In ad- 
dition, the day hospital population might not 
be an adequate sample of severely disturbed 
individuals. 

Another issue that may be raised concerns 
the construct validity of the R-S scale. If 
further research confirms the hypothesis that 
R-S is linearly related to adjustment, one may 
question the feasibility of using the term re-
pression-sensitization. In addition, since R-S 
is highly correlated with both social desira- 
bility and the Taylor MA scale, one may ques- 
tion whether the scale is measuring another 
variable, or is merely another operational defi- 
nition of another construct. As Byrne (1965) 
has asked, “Would it not be wise to use non- 
sense syllables or zip codes as the names of 
personality variables and thus side-step the 
problem” [p. 209]? But as both Byrne 
(1965) and Farber (1964) have noted, the 
names that we attach to variables influence 
the type of research conducted. As Farber fur- 
ther pointed out, the Taylor MA scale and 
the social desirability scale may be measuring 
the same thing. The fact that they actually
correlate —.84 would tend to support this as- 
sertion. Yet Farber goes on to note that the 
usefulness of the scale has been its supposed 
relevance as a measure of drive in Hull-Spence 
theory. Within this theoretical context, a 
series of studies has been conducted relat- 
ing individual differences in drive (as defined 
by scores on the Taylor MA scale) to such 
learning phenomena as eyelid conditioning 
and paired-associate learning. It is unlikely 
that the label “social desirability” would have 
generated such research. 

It may be that R-S is a misnomer and a 
better term would be tendency toward sensi- 
tization, or perhaps the construct should be 
dropped entirely and a term such as anxiety 
or social desirability substituted. However, 
at this juncture such action seems premature.
The construct of repression-sensitization and 
the nomothetic network surrounding this con- 
struct have led investigators to conduct a 
number of studies that probably would not 
have been conducted had another label been 
substituted.

The empirical findings relating R-S to differences
in perceptual thresholds (Tempone,
1964) or reactions to sexual stories (Byrne & 
Sheffield, 1965) may or may not have been 
conducted had the scale been called by another
name. One could also argue that a
number of fruitless studies have been con- 
ducted because of the surplus meaning at- 
tached to this construct. Thus the ultimate 
utility of this construct will not be established
by debate but by future empirical research.

In summary, it should be noted that on the 
bases of the statistical tests used in this 
study, one cannot conclude definitely that R-S 
is linearly related to adjustment or conflict. 
All that can be concluded is that the de- 
parture from linearity is not great enough to 
suggest the relationship is curvilinear. Fur- 
thermore, since the curvilinear hypothesis 
assumes that the relationship is an inverted 
U-shaped one, the probability of this type of 
relationship seems quite low. When the re- 
sults of this study are taken in conjunction 
with the other studies cited in the introduction,
the tenability of the curvilinear hypothesis
is highly questionable.


RELATIONSHIP BETWEEN PSYCHOTHERAPY WITH
INSTITUTIONALIZED BOYS AND SUBSEQUENT
COMMUNITY ADJUSTMENT ¹

ROY W. PERSONS

e University

A 1-year follow-up study evaluated the community adjustment of 41 delinquent
boys who, while incarcerated, had each participated in 40 group and
20 individual therapy interviews, and 41 matched control delinquents. The
therapy group had a low rate of recidivism compared with the control group
and the institutional base rates. The therapy group committed fewer offenses,
broke parole less, and had a greater percentage of boys employed for a longer
period of time. While still incarcerated, 30 of the 41 therapy boys were
judged to have been successfully treated, and these boys subsequently made a
significantly better community adjustment in all spheres than any other group.


The results of a number of psychotherapy 
studies with social deviants have suggested 
that therapy may be a fruitful rehabilitative 
enterprise (Persons, 1965a, 1965b; Shore, 
Massimo, & Mack, 1965; Shore, Massimo, & 
Ricks, 1965). There is, however, a lack of 
controlled follow-up studies, using well- 
matched subjects, which investigate delin- 
quents who received psychotherapy while in- 
carcerated. The recent studies that have 
yielded positive findings have been concerned 
with noninstitutionalized delinquents’ improv- 
ing their overt community behavior and 
incarcerated subjects’ improving overt behav- 
ior within the institution and becoming better 
adjusted as measured by psychological tests. 
Illustrative of this type of study, Persons 
(1966) made an institutional evaluation of 
82 incarcerated delinquent boys, 41 of whom 
received 20 weeks of therapy and 41 of whom 
received no therapy. The results indicated 
that the therapy group showed a superior 
adjustment as measured by psychological 
tests and a number of measures of overt 
behavior. The present study is a community 
follow-up report on the 82 boys 1 year after 
the termination of therapy. 

¹ The author wishes to express appreciation to the
Ohio Youth Commission, I. Warrick, and Donald
Mosher for assistance in this project. Portions of
this paper were presented at the annual convention
of the American Psychological Association held in
New York City, September, 1966.

In the initial study (Persons, 1966),
41 pairs of incarcerated delinquents were prematched
on a number of background variables;
the selection process was designed with
the objective of matching the therapy and 
control groups, subject for subject, on as 
many variables as possible. Each pair was 
selected so as to match as nearly as possible 
on the following variables: age, intelligence, 
race, socioeconomic background, type of of- 
fense, number of previous offenses, total time 
incarcerated during life, and nature of insti- 
tutional adjustment. One member from each 
pair was randomly assigned to a therapy 
group and the other to a control group. The 
boys’ socioeconomic backgrounds were esti- 
mated to be predominantly upper-lower and 
lower-middle class. There were 8 Negroes and 
33 whites in each group. The mean age and 
intelligence quotient of the treated boys were 
16.4 and 99.2, respectively, while the mean 
age and IQ for the controls were 16.3 and 
97.6. The mean number of officially regis- 
tered offenses prior to the current institu- 
tionalization was approximately four for each 
group, and the mean total time incarcerated 
for each group was approximately 11 months.
The two most typical offenses for which the 
boys had been committed were auto theft and 
breaking and entering, but there were several 
boys in each group who had committed more 
serious crimes. The boys in both groups were 
serving indeterminate sentences, and in each 
group their institutional adjustments preceding
treatment ranged from very good to very poor.

Each of the 41 boys in the therapy group
came twice weekly for group psychotherapy, 
which met for an hour and a half per session. 
In addition, each boy had an average of 1 
hour a week of individual psychotherapy. 
Thus, considering both individual and group 
therapy, each boy had 80 hours and 60 ses- 
sions of therapy over a 20-week period. 
There were five psychotherapists conducting 
interviews with six different groups of boys. 
In every case a boy had the same group and 
individual therapist. Throughout the 20 weeks 
the control group boys participated in the 
regular institutional program, but received no 
therapy. Following treatment, 30 of the 41 
therapy boys demonstrated less pathological 
test scores, while only 12 controls showed im- 
provement. The therapy boys showed better 
institutional adjustment, better interpersonal 
relationships, better performance in the insti- 
tutional school, had fewer disciplinary re- 
ports, and received their institutional passes 
sooner than did the boys in the control group. 
Twenty of the 30 successfully treated boys 
became more similar to their particular thera- 
pist as indicated by personality measures, 
vocational interests, scholastic orientation, 
verbal statements of desire to be similar, 
physical appearance, and language habits 
(Persons & Pepinsky, 1966). 

The critical question, of course, is whether 
the successfully treated boys’ superior insti- 
tutional adjustment persisted after release to 
the community. To answer this question, 1 
year after the termination of the therapy 
period the Juvenile Placement Bureau had 
each parole officer submit a standard detailed
report describing each boy’s activities since
release from the institution. Each boy had
meetings with his parole officer twice a
month, and the parole officer visited in the
boy’s home. The parole officers had detailed
information on all official and unofficial contacts
with all law enforcement agencies, as
well as employment information.


TABLE 1

COMMUNITY ADJUSTMENT OF THERAPY
AND CONTROL GROUPS

                                   Therapy     Control
                                   (N = 41)    (N = 41)
Subjects staying in community        28          16
Subjects reinstitutionalized         13          25
Mean offenses by returnees          1.94        3.07
Number of parole violators           20          32
Mean parole violations              1.75        3.91
Successes employed                   20           6
Returnees employed                    4           5
Successes’ mean time employed      6.2 mo.     3.2 mo.
Returnees’ mean time employed      2.3 mo.     1.9 mo.


Roy W. PERSONS


RESULTS AND DISCUSSION 


Obviously, all 82 of the boys were not re- 
leased the same day. The amount of time 
from the mean release date to the day the 
data were collected was 9.5 months, and there 
were no significant differences between the 
control and therapy groups concerning the 
time they were released; that is, on the 
average, the boys in both groups were released 
to the community 2.5 months after the 
therapy period ended. 

A comparison of the therapy and control 
groups’ community adjustment is presented 
in Table 1. In Table 1, “success” refers to 
those boys who have not been reinstitution- 
alized, while “returnee” refers to boys who 
have been reinstitutionalized in any penal 
institution. Comparing the number of suc- 
cesses and returnees for both groups indicates 
superior community adaptation on the part 
of the therapy group (chi-square = 7.06,
p < .01). The modal offense for which the
therapy boys were reinstitutionalized was
auto theft, and for the control boys, burglary
and auto theft were the most frequently committed
offenses. There was a trend for the
therapy returnees to commit fewer offenses
(t = 1.55, p < .10) than the control returnees.
Analyzing the number of parole violations
(behavior misconduct that does not
necessarily involve legal infractions, but
does constitute a violation of the parole restrictions)
does show that the therapy
boys committed significantly fewer parole
violations (t = 4.01, p < .001). A smaller
proportion of therapy than control boys committed
parole violations (z = 2.74, p < .004).
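
The two tests on counts reported in this paragraph can be rechecked from the Table 1 frequencies. The sketch below is only an illustration: it assumes an uncorrected Pearson chi-square for the 2 × 2 table of successes and returnees and a pooled two-proportion z test for the parole violators. The function names are hypothetical, and the paper does not state which formulas were actually used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two independent proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


# Table 1: therapy 28 stayed / 13 returned; control 16 stayed / 25 returned.
chi2 = chi_square_2x2(28, 13, 16, 25)

# Table 1: 32 of 41 control boys and 20 of 41 therapy boys violated parole.
z = two_proportion_z(32, 41, 20, 41)

print(f"chi-square = {chi2:.2f}")
print(f"z = {z:.2f}")
```

Under these assumptions the chi-square agrees with the reported 7.06, and the z statistic falls within about .01 of the reported 2.74.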

TABLE 2

PROPORTION OF JUDGED SUCCESSFULLY TREATED (S),
JUDGED NONSUCCESSFULLY TREATED (N), AND
CONTROL SUBJECTS (C) STAYING IN
THE COMMUNITY

Groups compared          Proportions         z ᵃ
N vs. S                  .273 vs. .833       3.29****
S vs. C                  .833 vs. .390       4.13****
N vs. C                  .273 vs. .390        .69
S vs. all subjects       .833 vs. .536       2.41

ᵃ z test between two proportions.


An evaluation made following therapy and
while the boys still were incarcerated indicated
that 30 boys had shown significant improvement.
One year later, 25 of these
judged successfully treated boys were still in
the community, whereas only 3 of the 11 nonsuccessfully
treated boys were still in the
community. As can be seen from Table 2, a z 
test between two proportions indicates that 
this difference is highly significant. Also, the 
judged successfully treated boys were much 
superior to the control group in their ability 
to stay in the community. The therapy boys 
that were judged as unsuccessfully treated 
did not differ from the control group in abil- 
ity to stay in the community. A greater 
proportion of the judged successfully treated 
group remained in the community as com- 
pared with all other groups. Of the 20 suc- 
cessfully treated boys who became more simi- 
lar to their therapist, 18 were still in the 
community 1 year later. Although this finding 
raises interesting questions, generalizations 
about the benefit of converging to the thera- 
pist are not appropriate because 7 of the 10 
nonconverging successfully treated boys were 
also in the community a year later. 

Following Meehl and Rosen’s (1955) base-rate
suggestion, two different procedures were
used to arrive at the base rates of recidivism. 
In this context recidivism refers to boys who 
have been released from this particular institution
and who have been reinstitutionalized
in any penal institution. The first base-rate
figure was obtained merely by having the 
institution’s classification officer estimate the 
proportion of released boys who would be 
reincarcerated in any institution within 9.5 
months. His estimate was 65%. The second 
procedure involved drawing a sample of 100 
case folders from the central record file in 
Columbus, Ohio, and tabulating the number 
of boys who had become reincarcerated in 
any institution within 9.5 months after re- 
lease from the institution. Sixty-two percent 
of these sampled boys had been reinstitu- 
tionalized within 9.5 months. Table 3 pre- 
sents a comparison of the groups in this study 
with the institution’s base rates, using the 
latter and more conservative estimate. 
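
One plausible way to compare a study group with the sampled base rate is a pooled two-proportion z test that treats the 100 case folders as an independent sample. This is an assumption made for illustration, not a procedure the paper spells out, and `two_proportion_z` is a hypothetical helper name.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two independent proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


# Base-rate sample: 62 of 100 boys reinstitutionalized, so 38 of 100 stayed out.
# Therapy group: 28 of 41 stayed in the community.
z_therapy = two_proportion_z(28, 41, 38, 100)

# Control group: 25 of 41 reinstitutionalized, against 62 of 100 in the sample.
z_control = two_proportion_z(25, 41, 62, 100)

print(f"therapy vs. base rate: z = {z_therapy:.2f}")
print(f"control vs. base rate: z = {z_control:.2f}")
```

Under this assumption the therapy comparison gives a z of about 3.27, close to the value tabled for the therapy group, while the control comparison is near zero. The small gap may reflect rounding or a different formula in the original analysis.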

TABLE 3

A COMPARISON OF THE INSTITUTION’S BASE RATES
WITH THE THERAPY AND CONTROL GROUPS

Group                                           Proportion      z

Therapy subjects staying in community              .683
                                                              3.29****
Base rate of staying in community                  .38

Judged successfully treated subjects
  staying in the community                         .833
Base rate of staying in community                  .38

Control subjects reinstitutionalized               .61
Base rate for reinstitutionalization               .62

Judged nonsuccessfully treated subjects
  reinstitutionalized                              .73
Base rate for reinstitutionalization               .62

**** p < .001.


The judged successfully treated group and
the total therapy group both had a significantly
greater proportion of boys staying in
the community as compared to the base rates.
For the control and judged nonsuccessfully
treated groups, there was no significant difference
from the base rate in proportion of
recidivism. The different therapy groups compared
with base rates yield results similar to
those of the therapy-control group comparisons.


TABLE 4

LENGTH OF TIME EACH GROUP WAS EMPLOYED

Groups compared                          Mean difference    df      t

Therapy success vs. therapy failure          3.9 mo.        22
Therapy success vs. control success          3.0 mo.        24     2.56*
Therapy success vs. control failure          4.3 mo.        23     3.23**

* p < .02.
** p < .01.

Employment history provides yet another 
measure of community adjustment for the 
group of boys. Of all of the boys released, 
significantly more of the therapy boys were 
employed than control boys (p < .01). In 
considering just the boys successfully remain- 
ing in the community, there were signifi- 
cantly more therapy boys employed than were 
control boys (chi-square = 4.85, p < .05). Of 
the 24 boys in the therapy group who were 
employed, only 4 were reincarcerated while 
20 remained in the community (z = 4.66, 
p< .01). Twenty-eight therapy boys re- 
mained in the community; of these 28, 20 
were employed. Table 4 presents a compari- 
son of the length of time the boys in each 
group were employed. The successfully 
treated therapy boys were employed signifi- 
cantly longer than any other group. There 
were no significant differences as to length 
of time employed between the therapy failure, 
the control success, and control failure 
groups; however, there was a trend for the 
boys in the control success group to be em- 
ployed longer than the boys in the other two 
groups. 
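
The employment figures can be cross-checked against the counts and means in Table 1. The following sketch assumes an uncorrected Pearson chi-square for employment among the boys who stayed in the community, and degrees of freedom of n1 + n2 - 2 for the comparisons in Table 4 (the usual independent-groups t test). Both are assumptions, and the variable names are illustrative only.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Boys remaining in the community: 20 of 28 therapy and 6 of 16 control boys employed.
chi2_successes = chi_square_2x2(20, 8, 6, 10)

# Mean months employed and number of employed boys per subgroup (Table 1).
mean_months = {"therapy success": 6.2, "therapy failure": 2.3,
               "control success": 3.2, "control failure": 1.9}
n_employed = {"therapy success": 20, "therapy failure": 4,
              "control success": 6, "control failure": 5}

rows = []
for other in ("therapy failure", "control success", "control failure"):
    diff = round(mean_months["therapy success"] - mean_months[other], 1)
    df = n_employed["therapy success"] + n_employed[other] - 2
    rows.append((other, diff, df))

print(f"chi-square = {chi2_successes:.2f}")
for other, diff, df in rows:
    print(f"therapy success vs. {other}: difference = {diff} mo., df = {df}")
```

With these inputs the chi-square agrees with the reported 4.85, and the mean differences (3.9, 3.0, and 4.3 months) and degrees of freedom (22, 24, and 23) agree with the entries in Table 4.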

Approximately 90% of the boys were released
to their homes with no significant difference
between control, therapy, success, and
returnee boys concerning type of community
placement. In an effort to compare the environment
to which the boys would be released,
the author carefully read each boy’s
case folder and rated his community place- 
ment environment from 1 to 9, extremely 
poor to extremely good. These judgments 
were based upon such criteria as: broken 
home, relationship with parents or parent sur- 
rogates, delinquency rate of the home area, 
family and financial status, and job possibili- 
ties. Although it was possible for a boy’s 
home placement to be marked 9, in actuality 
the rankings were uniformly low and the 
highest ranking received by any boy was 5.5. 
There were no significant differences between 
the home situations of the therapy boys who 
remained in the community, therapy boys 
who were reinstitutionalized, control boys 
who stayed in the community, and control 
boys who were reinstitutionalized. Mean 
ratings for the groups were, respectively: 
2.94, 2.46, 3.16, and 2.79. Despite the crude- 
ness of such a home-rating method, another 
institutional psychologist² not associated with
the project was asked to make similar ratings. 
The two separate ratings were strikingly in 
agreement with a .52 mean difference in the 
two rankings, and exact concordance in 73% 
of the homes rated. Sixty-eight percent of all 
the boys came from broken homes with 61% 
of the therapy returnees, 74% of the control 
returnees, 67% of the therapy community remainers,
and 68% of the control community remainers
coming from broken homes. The most
powerful observation which seems to defy 
meaningful numerical expression, made from 
reviewing the records and listening to the
recorded interviews, was that almost every
boy had never experienced a satisfactory relationship
with his father or father surrogate.
More often than not the relationship was
extremely stormy or the father was not a
member of the family constellation.

The results of this study seem to indicate
that psychotherapy can be an important factor
in rehabilitation of delinquent youths. It
should be particularly noted that only 5 of
the 30 boys who were judged to be successfully
treated subsequently became reinstitutionalized.
However, from these results it
should not be construed that psychotherapy
is a rehabilitative panacea. For maximum
results it seems that a boy needs to have a
successful therapy experience, a reasonably
adequate community replacement, and employment.
There were also some notable therapy
failures, such as one armed robbery and
an armed robbery and murder. Nevertheless,
the results indicate that psychotherapy
helped most of the boys reverse their antisocial
behavior and become more responsible
individuals.

² R. L. Uhl generously assisted in reviewing cases
and making ratings.


RELATIONSHIP OF DISTORTION TO DAP DIAGNOSTIC
ACCURACY AMONG PSYCHOLOGISTS AT THREE
LEVELS OF SOPHISTICATION ¹

24 psychologists (10 DAP users, 10 nonusers, and 4 projective test experts)
were asked to categorize 48 DAP protocols as being the productions of or- 
ganics, paranoid schizophrenics, nonparanoid schizophrenics, or normals and 
to rate the drawings for distortedness. The accuracy of the judges’ diagnostic 
impressions was only a slight improvement over chance, and it was further 
found that diagnostic acuity did not vary with judges’ DAP experience or 
projective test sophistication. Although drawing distortion ratings were very 
highly related to every judge’s diagnostic impressions, they were uncorrelated
with hospital-record diagnosis in any of the 24 cases. The latter finding suggests 
that many psychologists have very seriously overestimated the extent to which 
drawing distortion is useful as a diagnostic indicator. 


The Draw-A-Person (DAP) has fared 
rather poorly or, at best, inconsistently, in a 
number of studies designed to evaluate its 
usefulness as a test of general personality ad- 
justment (Griffith & Peyman, 1959; Royal, 
1949; Schaeffer, 1964; Sherman, 1958; Stoltz 
& Coltharp, 1961; Swenson, 1957; Whitmyre,
1953). Although it may be of value
in measuring traits or personality character- 
istics of various sorts (e.g., Griffith & Pey- 
man, 1959; Hoyt & Baron, 1959; Mogar, 
1962), efforts at establishing it as a tool 
useful in the differentiation of diagnostic 
groups have only occasionally met with success
(e.g., Albee & Hamlin, 1950; Holzberg
& Wexler, 1950; Hozier, 1959), and many 
such positive results seem attributable to con- 
founding with age or IQ (as pointed out by 
Lewinsohn, 1965). For a more comprehensive
discussion of earlier findings the reader is
referred to Swenson (1957), or for more
recent findings to Lewinsohn as well as Hiler
and Nesvig (1965).

¹ This project was supported by the Veterans Ad-
Norman Tallent, Richard W. Thomas, Jerry Tomlinson,
Robert Tucker, and Charles Van Buskirk
(users); William H. Colley, Edward M. Ells, Warren
Freiband, William G. Klett, Leonard Lipton, Patrick
E. Logue, Gordon W. Olson, Vinton N. Rowley,
Anthony B. Tabor, and Albert E. Uecker (nonusers);
Max L. Hutt, Bruno Klopfer, Pauline G.
Vorhaus, and Karen Machover (experts).

Nevertheless, the DAP continues to rank 
among the most frequently used tests in clini- 
cal settings (Sundberg, 1961). Its continued 
popularity as a diagnostic indicator suggests 
that the DAP’s supporters have largely dis- 
missed earlier discouraging research efforts. 
Presumably these psychologists feel that, de- 
spite the failures of judges in previous studies, 
there nonetheless exists a subset of clinicians 
(including themselves) in whose hands the 
DAP becomes a valuable diagnostic instru- 
ment. The nature and size of this hypothe- 
sized subset of capable DAP workers almost 
certainly varies from the perceptions of one 
psychologist to those of another, but it is 
reasonable to assume that, for most clinicians, 
it includes either/both (a) that group of 
experts in projectives whose contributions to 
the field have earned them national renown 
as projective test authorities or/and (b) those
clinicians who use the test in their day-to-day 
practice. The existing skepticism as to the 
importance of earlier studies has also undoubtedly
been enhanced by some earlier investigators’
practice of employing limited
numbers of judges (i.e., very small Ns), and,
in some cases, nonpsychologists as judges.

One of the two primary purposes of this 
study was to determine whether DAP diagnostic
expertise does in fact vary with the
psychologist’s level of experience with the test 
and/or his credentials as an expert in the 
projective field. 

The second major purpose was to study the 
relationship of drawing distortedness to diag- 
nosis. Frequently, psychologists seem to diag- 
nose patients—generally as schizophrenic or 
organic—largely or solely on the basis of the 
poor form level of the drawings at hand. This 
tendency, perhaps derived from Goodenough’s 
(1926) work on the complexity of children’s 
artistic production as a function of age and 
from the Freudian notion of schizophrenic 
regression, appears much in need of valida- 
tion. The need seems particularly pressing 
since many psychologists hold that normals 
produce drawings of generally good form level 
even though a large number infrequently—if 
ever—come in contact with the drawings of 
normals during their training and often have 
little firsthand knowledge as to their nature. 
The second goal of this project was to assess 
the relationship of distortion to diagnosis and 
further to evaluate the hypothesis that cli- 
nicians’ diagnostic impressions depend largely 
on distortion level. To this end, an attempt 
was made to compare the relationships of 
distortion ratings to diagnosis (a) as defined
by hospital records and (b) as defined by 
psychologists’ clinical impressions from the 
DAP analysis. 


PROCEDURE 


Judges. Twenty-four judges were employed. Ten 
were clinicians who reported that they made regular 
use of the DAP on at least an occasional basis in
the course of their practices; of these 10, 9 stated
they frequently employed the test while 1 reported
using it only on occasion. These judges were termed
“users” for purposes of this study. A second group
consisted of 10 practicing clinicians who stated that
they did not employ the test although they may have
done so at one time (“nonusers”). All users and
nonusers were psychologists known by the author
or were individuals recruited for the study at the
suggestion of mutual acquaintances. A third group
of four judges consisted of individuals who, through
their extensive writings, had established themselves
as experts in the field of projective techniques. None
were personally known to the author, and all were
volunteers approached by mail. During the course
of the study, 21 potential projective test experts
were approached, of whom 17 chose not to participate.
Therefore the remaining sample of four
cannot be considered a random sample of projective
experts by any stretch of the imagination. It should 
also be noted that of the four, only one made 
his/her reputation with figure drawings and this 
sample should not be construed as a group of DAP 
experts. From 1965 APA Directory entries, the 
reported amount of clinical experience for each judge 
was determined. The means for the user and nonuser 
groups did not differ significantly (t= .59, df= 18, 
overall mean = 11.3 years, SD = 5.7) but the pro- 
jective test experts, with a mean of 29.8 years of 
experience (SD = 5.0), were significantly higher in 
this respect than the 20 nonexperts (t=5.7, 
df= 22). All four projective test experts, three users 
and one nonuser, were ABEPP diplomates. With the 
exception of one expert, all judges held PhD degrees. 

Drawings. Four sets of 12 drawings each were 
employed. Thirty-six were products of neuropsychi- 
atric patients at the Veterans Administration Hos- 
pitals, Knoxville, Iowa, and St. Cloud, Minnesota. 
Of the 36, 12 were men diagnosed chronic brain 
syndrome. Etiological variables associated with brain 
damage were trauma (six cases), alcoholism (one), 
infection (one), Alzheimer’s disease (one), Wilson’s 
disease (one), idiopathic epilepsy (one), epilepsy 
and temporal lobe resection (one). These organics 
all showed strong clinical and/or laboratory evi- 
dence of cerebral lesions. Twelve subjects (Ss) bore 
the diagnosis paranoid schizophrenia and 12 were 
patients diagnosed schizophrenic but not labeled 
paranoid (“nonparanoid schizophrenics”). The remaining
12 were produced by full-time employees of
the Housekeeping Department at the St. Cloud hos- 
pital who had volunteered to serve in this project. 
Housekeepers were chosen because of the likelihood 
that their vocational and socioeconomic achievement 
would more nearly approximate those of the pa- 
tients than would those of any other group of avail- 
able employees. All Ss were males under the age of 
50 without history of lobotomy or more than 25 
electroshock treatments. No member of the three
clinical groups had ever borne a diagnosis suggesting
he might be a candidate for one of the
other two samples. According to hospital personnel 
records, only one of the housekeepers had a history 
of mental illness; he had been hospitalized for 6 
weeks a full 12 years prior to the running of the 
study.

The four groups did not differ significantly (at
.05) with respect to mean age (F = 1.41; df = 3, 44;
M = 37.2) or Revised Beta IQ (F = 2.39; df = 3, 44;
M = 97.2), and the three clinical samples did not
differ on mean length of psychiatric hospitalization
(F = 2.66; df = 2, 33; M = 33.2 months).

However, the F for education level (2.94; 
df =3, 44) was significant at .05, the group means 
being 11.1 (organics), 10.8 (paranoids), 10.9 (non- 
paranoid schizophrenics), and 8.8 (normals). In 
order to determine whether the educational differ- 
ences might have affected distortion ratings (to be 
discussed below), the correlation between education 
and mean distortion rating over all 24 judges was 
determined separately for each group of 12 Ss. The
test that two independent populations have the same
correlation (Walker & Lev, 1953, pp. 255-256) was 
applied to the largest and smallest r’s (those for the 
paranoid and nonparanoid schizophrenic groups). 
Since the z was only .977, it was logical to assume
the true correlation did not vary from group to 
group. At this point, then, it was logically pos- 
sible to recalculate the r over all 48 Ss. The result- 
ing coefficient, .155, was not significantly different 
from zero (t = 1.06, df= 46, p > .20), and the effect 
of the education differences on distortion was con- 
sidered unlikely to have been of any appreciable 
consequence in the present investigation. 
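
The significance test on the pooled correlation can be reproduced with the standard conversion of a Pearson r to a t statistic with n - 2 degrees of freedom. The sketch below uses the reported r of .155 over 48 Ss; `t_from_r` is a hypothetical helper name, and the formula is the textbook one rather than anything quoted from the paper.

```python
import math


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing that a Pearson correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n_subjects = 48
t = t_from_r(0.155, n_subjects)

print(f"t = {t:.2f}, df = {n_subjects - 2}")
```

The result agrees with the reported t of 1.06 on 46 degrees of freedom.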

A male and a female figure drawing (one each) 
were elicited on unlined white 8 × 10 inch paper,
one drawing to a sheet. For presentation to the 
judges, Ss’ names were covered with masking tape. 
On each drawing, the examiner penciled in S’s ran- 
domly assigned serial number, the sex of the figure 
(as reported by S) and whether the production 
was the first or second drawn. Judges were informed 
that, of the 48 drawings, 12 each were produced by 
patients diagnosed chronic brain syndrome, paranoid 
schizophrenic, and “nonparanoid” schizophrenic and 
that the remaining dozen were drawn by full-time 
hospital housekeepers. The judges were asked to sort 
the protocols into equal-sized groups corresponding 
to the categories above. In addition they were asked 
to rate each protocol on a normally distributed 
11-point scale reflecting distortion, which was defined 
as “lack of similarity to real human forms.” The 
reliability of the distortion ratings over the 24 judges 
was obtained by use of Ebel’s (1951) intraclass 
correlation technique; the resulting coefficient 
was .70. 


RESULTS 


Number correct. The overall mean number 
correct was only 13.46 (where 12 could be 
anticipated by chance) with a range from 10 
to 19. Nevertheless, the mean number correct 
exceeded the chance expectation at the .01 
level (t = 2.98, df = 22).
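
The chance expectation of 12 follows from the sorting task itself: a judge who fills four piles of 12 at random will, on average, place 12 × 12 / 48 = 3 protocols from each group in the matching pile. The sketch below works this out exactly; the function name is hypothetical and the calculation assumes purely random sorting into piles of the stated sizes.

```python
from fractions import Fraction


def expected_chance_hits(group_sizes):
    """Expected number of correct placements when protocols are sorted at random
    into piles whose sizes equal the true group sizes."""
    total = sum(group_sizes)
    # A pile of size n holds n randomly chosen protocols, each of which belongs
    # to the matching group with probability n / total.
    return sum(Fraction(n * n, total) for n in group_sizes)


sizes = [12, 12, 12, 12]          # organics, paranoids, nonparanoids, normals
chance_total = expected_chance_hits(sizes)
chance_per_group = Fraction(12 * 12, 48)
observed_mean = 13.46             # overall mean number correct reported above

print(f"chance total = {chance_total}, chance per group = {chance_per_group}")
print(f"observed hit rate = {observed_mean / 48:.3f} vs. chance {float(chance_total) / 48:.3f}")
```

Expressed as a proportion, the observed mean corresponds to roughly 28% correct against a chance level of 25%.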

The number of protocols correctly identi- 
fied varied extremely little with the degree to 
which the judges used the test. The F for dif- 
ferences in total correct between the three 
groups was very low (.08, df= 2, 21) and 
showed no evidence of approaching signifi- 
cance even had a larger sample of judges 
been employed. Moreover, the psychologists’ 
ability to diagnose correctly (number cor- 
rect) did not vary with years of clinical 
experience (r = .002) nor with American
Board of Examiners in Professional Psychol- 
ogy (ABEPP) status (t= .05, df = 22, ns). 

However, the mean number correct apparently
did vary with diagnostic category. Over
all three levels of experience, mean correct
for organics, paranoids, nonparanoid schizo- 
phrenics, and normals (with chance equaling 
three) were 3.96, 4.42, 2.87, and 2.17, re- 
spectively. The totals correct for organics and 
paranoids exceeded chance at the .01 level 
(ts = 4.17 and 5.07, dfs = 23) while that
for the normals was significantly less than
chance (t = 3.77, df = 23, p < .01). (The
reader should interpret the above statistics 
with caution since judges knew how many 
members of each diagnostic/control category 
were present and the observations were 
therefore not totally independent.) 

Distortion ratings. Differences between the 
mean distortion ratings of the four diagnostic 
groups as defined by hospital records were 
computed separately for each judge. Of the 
24 Fs thus calculated, only four (two users 
and two nonusers) were significant at the .20 
level while none reached significance at .10. 
Although the reader should again note that 
the ratings were not independent, it is quite 
apparent that the relationship between diag- 
nostic-category membership and drawing dis- 
tortion as defined by the psychologists was, 
for all practical purposes, nil. 

Differences between the distortion-rating 
means of the four diagnostic groups as de- 
fined by the judges’ own clinical impressions 
were also analyzed, separately for each psy- 
chologist, by F tests. In contrast to the ab- 
sence of impressive differences between the 
distortion-rating means of the four groups 
as developed from hospital records, all the 
Fs were highly significant. The F ratios 
ranged from 4.20 (df = 3, 44, p < .025) to 
103.33 (p < .001) with a mean F of 15.80. 
Seventeen of the 24 Fs were significant 
beyond the .001 level while 6 others reached
.005 and 1 was significant at only .025.
Obviously, there was a very strong relationship
between psychologists’ drawing distortion 
ratings and their diagnostic impressions. 
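
The per-judge F tests described above can be illustrated by a plain one-way analysis of variance over the four groups of distortion ratings. The sketch below is hedged: the function name `one_way_f` and the ratings are hypothetical, and it assumes an ordinary between-groups F ratio of the kind implied by the reported df = 3, 44.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical distortion ratings by one judge for four groups (not the study's data).
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7], [3, 4, 5]]
f_ratio, df_between, df_within = one_way_f(hypothetical_groups)
print(f"F = {f_ratio:.2f}, df = {df_between}, {df_within}")
```

Four groups of 12 drawings each would give df = 3, 44, matching the values reported for the judges.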

It is also interesting to note that all 10 
nonusers’ Fs were significant at .001 (mean 
F = 22.75) while only 4 of the users’ Fs
(mean F = 7.42) reached that level of sig- 
nificance. It appears that psychologists, when 
forming diagnostic impressions from DAPs, 
rely extremely heavily on the extent to which
drawings differ from human forms, but that
the extent to which they do so may vary 
inversely with their level of experience with 
the test. 


Discussion 


Although clinical psychologists (at least as 
a whole) apparently are able to improve 
upon chance in their diagnostic efforts with 
DAP, the degree of improvement is so slight 
as to cast considerable doubt on the fruitful- 
ness of DAP diagnostic impressions. In fact, 
the results suggest that reliance on the DAP 
as a diagnostic indicator is actually detri- 
mental to overall diagnostic acuity, except in 
those rare settings where the base-rate fre- 
quencies of the various nosological categories 
are essentially equal (Meehl & Rosen, 1955). 
Even in such settings, and regardless of the 
psychologist’s level of experience with the 
test, accuracy apparently varies with
(hospital-file) diagnosis and, from the above
results, it appears quite possible that the test
is frequently of no value or may have a
deleterious effect on diagnostic accuracy when
the choice to be made involves certain foils,
including “normal” or “nonparanoid
schizophrenic.”


This is not to suggest that the DAP is of
no value to clinicians. In a number of studies,
authors have found various DAP characteristics
related to an assortment of personality
traits, such as delusions of reference
(Griffith & Peyman, 1959), differential sex role
perceptions (Cook, 1951), improvement in
psychotherapy (Gutman, 1952), anxiety
(Hoyt & Baron, 1959; Mogar, 1962),
antisocial tendencies (Baugh & Carpenter, 1962),
and popularity (Richey & Spotts, 1959), to
name a few. Rather, the present findings
indicate that the use of DAP analysis in
the development of diagnostic decisions
appears to be a questionable procedure,
regardless of the psychologist’s familiarity
(and probably, self-confidence) with the test.
The notion that a “trait” approach to the
DAP is more valuable than a diagnostic one
is also supported in a particularly striking
manner by Griffith and Peyman, who found
their judges able to predict ideas of reference
from DAP eye-ear emphasis, but unable to
differentiate diagnosed paranoids from
schizophrenics previously placed in the nonparanoid
categories.

Obviously such failures and the difficulties 
encountered by judges in the present study 
may well stem from the unreliability of diag- 
noses rather than from shortcomings of the 
test or its users but, even if the fault lies 
with the criterion rather than the test, the 
use of DAP diagnoses appears unwarranted 
since, in reality, correspondence to estab- 
lished hospital diagnosis is the criterion of 
excellence actually sought by most clinicians. 
Indeed, it is probably significant that no less 
an expert than Machover* has expressed 
“grave misgivings” about the use of the test 
in diagnostic categorizations, at least out of 
the context of the patients’ age, cultural
background, etc. 

The findings seem to indicate quite strongly 
that distortion is unrelated to hospital
diagnosis despite its high correlation with
psychologists’ diagnostic impressions. The fact
that psychologists seem to view distortion as 
an extremely important indicator of pathol- 
ogy is particularly disturbing in view of the 
absence of any evidence for such a relation- 
ship in the present study. 


8 Personal communication, 1964.


PREDICTION OF VICTIMIZATION FROM AN INSTRUMENTAL
CONDITIONING PROCEDURE* 


G. R. PATTERSON 


University of Oregon 


The report presents a description of an instrumental conditioning apparatus 
designed to serve as a personality assessment procedure. In this procedure, 
a visual stimulus was made contingent upon a lever-pressing response. The 
magnitude of the change in rate of responding constituted the dependent 
measure. The stimuli consisted of cartoons portraying social behaviors that 
had in the past for the children studied been associated as conditioned stimuli 
for pain. Each of the Ss was observed over a 5-wk. period in a nursery school
setting. The number of occasions on which each child was victimized by an 
aggressor was recorded. It was predicted that children who had been more 
frequently victimized would show a greater impairment in the rate of 
responding when such stimuli were made contingent upon lever pressing. 


The prediction was confirmed. 


Traditionally, assessment devices have pre- 
sented the subject with a set of stimuli 
(items) and, after grouping the elicited re- 
sponses into a subset (scale), have made 
predictions to external criteria of various 
kinds. For discussion purposes, it is
reasonable to view this approach as a poorly
controlled variant of standard laboratory 
procedures. The traditional assessment pro- 
cedures are viewed as poorly controlled be- 
cause the variables determining the behavior 
of the subjects are relatively unspecified. Al- 
though the subject’s behavior may consist of 
a quantifiable response such as marking a 
question as “true” or “false,” the variables 
controlling the response may be the content 
of the item, or any one of a half dozen 
response sets. Occasionally, the self-report 
behavior is random. It is not surprising that 
such data rarely account for more than 10% 
of the criterion variance. In spite of the fact 
that these procedures seldom account for a 
large amount of variance for any single
criterion, some of these devices such as the
Minnesota Multiphasic Personality Inventory
can account for small amounts of variance 
across an impressive array of criteria. It is evi- 
dent that whatever assessment devices might 
be constructed for the clinician they must 
eventually make provision for a similar range 
of criterion coverage. The writer proposes 
that some traditional laboratory procedures 
might be adapted to assessment problems. 
The eventual utility of such an application 
would be a function of its ability to provide 
measures which can make predictions to a 
significant range of criterion behaviors. 

The present report summarizes the last in 
a series of six pilot studies designed to 
explore a possible contribution of one type 
of laboratory procedure to the assessment 
process. For example, operant technology 
could provide better controls by limiting the 
number of variables which are operating in 
determining the behavior of the subject. Pre- 
sumably, these better controls should provide 
better response data which in turn could 
account for more of the variance in the cri- 
terion measures than has been the case for 
the traditional assessment devices. 

Aside from the allure of tighter controls, 
there seems to be little else about operant 
technology that immediately recommends it 
to the assessment enterprise. For example, it
is possible to conceive of changes in rate of 
response as a substitute for marking “true” 
or “false” to items. However, it is not always
clear what the appropriate independent
variables might be in the laboratory-assessment
procedure which would provide for predic- 
tions across a wide range of social behaviors. 
What is required before seriously entertaining 
the possibility of such an approach is a gen- 
eral paradigm for systematically programming 
independent variables within the operant 
framework. To have maximal general utility, 
the laboratory procedures would have to lend 
themselves to great flexibility in the program- 
ming of independent variables. Flexible pro- 
gramming would imply that some provision 
be made for systematically sampling a wide 
range of stimuli to be used as reinforcers in 
the operant task. Presumably, each set of re- 
inforcers would make it possible to predict 
a different set of criteria. Given that a clini- 
cal psychologist can specify a list of criterion 
behaviors for which predictions would be of 
some value, then some hypothetical “system- 
atic framework” would be used to specify the 
subsets of reinforcer stimuli to be used in 
the laboratory task in order to make
prediction to each criterion. The effect of these
stimuli in controlling the behavior of the 
subjects would be precisely measured and 
would provide the basis for making predic- 
tions to the criterion. For present purposes, 
the immediate question concerns the frame- 
work which will be used to select the stimuli 
to be introduced in the laboratory setting. 
The main purpose of the present report was 
to provide one possible approach to this 
problem together with a preliminary set of 
data testing its effectiveness. 

First, it was necessary to introduce the 
obvious assumption that there are individual 
differences in subjects’ responsiveness to any 
given class of reinforcing stimuli. These dif- 
ferences in responsiveness should reflect, in 
part, differences in past conditioning histories. 
Presumably, the criterion behaviors which we 
wish to predict are also outcomes of a similar 
conditioning history. For example, for any 
individual there are a number of social be- 
haviors which are likely to become associated 
with aversive stimuli. Although the specifics 
may vary from individual to individual, such 
pairings occur in the life of most people living 
in this culture. For most, the sight of the
dentist’s uniform and his drill becomes a
conditioned stimulus (CS) for pain reactions.
However, individuals differ in terms of the number
of such pairings and intensity of the uncondi- 
tioned stimuli (UCS) characterizing these 
associations. If, in an operant conditioning 
procedure, a picture of the dentist’s drill were 
made contingent upon lever pressing, it 
would be expected that such a contingency 
might have a mild suppressing effect upon the 
rate of responding (Church, 1963; Solomon, 
1964). It would also be expected that indi- 
viduals who had had a large number of these 
associations of the CS to pain would show a 
greater suppression of rate of responding than 
would individuals who had had fewer, or less 
intense, original pairings. As a general case, 
it was assumed that there was a linear, mono- 
tonic function holding for the relation be- 
tween the frequency and intensity of the 
pairings for CS and UCS on the one hand, 
and the effect of CS in suppressing the rate 
of responding when made contingent upon a
response in an instrumental conditioning task. 

The criterion used in the present study 
consisted of a set of behaviors which occur 
rather frequently in the socialization of most 
children. The criterion was “frequency of 
victimization.” In an observational study 
reported by Bricker and Patterson (1964), 
data were collected on the frequency 
with which children were attacked by other 
children. The data showed impressive indi- 
vidual differences in the number of oc- 
casions, on any given day, in which a child 
would be “victimized” by other children. For 
example, some children were struck as often 
as 70 times, while others were not assaulted 
at all. The data also showed moderate stabil- 
ity in the frequency with which individual 
children were victimized. The correlations 
between the median rankings of victim status 
were .54 for a 6-week period and .40 for 4 
9-month period. 

The behavior displayed by the attacker was 
most frequently such responses as “hit with 
hand,” “push,” or “hit with object.” These 
behaviors of the attacker were viewed as
CS’s for the pain which followed. It was
predicted that for some children pictures of
children hitting or pushing other children
would function as being mildly aversive. If, in
an instrumental conditioning procedure,
lever pressing produced displays of such
pictures, it would be predicted that for some
children the rate of lever pressing would be 
moderately impaired. More to the point, it 
was predicted that the magnitude of the im- 
pairment effect would correlate significantly 
with the number of times a child had been 
observed to have been victimized. 


METHOD 
Sample 


The subjects were 23 children from two nursery 
schools. Their ages ranged from 3 to 4 years. The 
sample was equally divided as to sex. Most of 
these subjects were from middle- and upper-class 
families. 


Procedures 


The laboratory apparatus consisted of a slide 
projector which was programmed to present a
stimulus on a fixed interval schedule (5 seconds). 
The schedule was operative only if the child con- 
tinued responding. The instructions to the child were 
as follows: 


There are pictures in this machine and you can 
make them come in this window by tapping the 
button on the desk like this [experimenter demon- 
strates]. You can tap the button and you will see 
pictures in the window. When a picture comes, 
look at it, but keep pressing the button. Keep 
tapping and looking at the different pictures in 
the window until I tell you to stop. 


A series of studies has been carried out with 
the apparatus (Patterson & Littman, 1963). These 
studies have shown that if pictures which are gen- 
erally of interest to children are presented on the 
screen at a fixed ratio schedule there is a significant 
increase in rate of responding. These findings are
in keeping with the research findings by Munsinger
(1964) to the effect that any meaningful stimulus
can function as a reinforcer and that reinforcer
effectiveness varies as a function of the “amount” of
meaning. In the pilot study, 15 children from the
a grade showed an increase of 1.0 response per
slide.

In the next pilot study, eight children from the
present sample participated in the procedure under
conditions of nonreinforcement. Only those children
were selected who had been observed to display
neither extreme aggressiveness nor extreme passivity.
In the procedure, 40 slides were presented which
depicted lines tilted at various angles. The first
question raised concerned the sampling of behavior
necessary to establish a stable estimate for rate of
responding. The data showed that the median rate
of responding based upon Slides 6 through 10
correlated .87 with the median rate of key pressing
during the remainder of the trial. The young child
seemed to require a certain amount of participation
in the procedure as a settling down or adaptation
period. As might be expected, under conditions of 
nonreinforcement there was a marked decrease in 
the rate of responding. The median difference be- 
tween the block of Slides 5-10 and the remainder 
of the trial was -1.0 (responses per slide).

The 15 subjects remaining in the sample were 
used for the experimental study. Ten slides depicting 
tilted lines constituted the base operant period. 
During the conditioning period, 18 slides were
presented. These latter slides presented aggressive
interactions among children. These slides were cartoons
taken from an objective test designed to predict 
aggressive behaviors in children (Patterson, 1960). 
The number of key-tapping responses was recorded 
for each slide. The difference between the median 
number of key taps for the base operant slides 
(6-10) and conditioning slides (11-28) constituted 
a measure of behavior change. 
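
The measure of behavior change just described can be sketched in a few lines. The function name `change_score`, the tap counts, and the direction of subtraction (conditioning minus base, so that a negative score indicates a decrease in rate) are assumptions made for illustration; the text specifies only that the two medians were differenced.

```python
from statistics import median


def change_score(taps_per_slide, base=(6, 10), conditioning=(11, 28)):
    """Median taps on conditioning slides minus median taps on base operant slides.

    taps_per_slide[0] holds the count for Slide 1; slide ranges are inclusive.
    """
    def block(first, last):
        return taps_per_slide[first - 1:last]

    return median(block(*conditioning)) - median(block(*base))


# Hypothetical key-tap counts for Slides 1-28 (not the study's data):
# five adaptation slides, five base operant slides, eighteen conditioning slides.
hypothetical_taps = [9, 9, 9, 9, 9] + [10, 12, 11, 10, 11] + [8] * 18
print(change_score(hypothetical_taps))
```

Slides 1 through 5 are ignored here, in keeping with the adaptation period noted in the pilot work.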

The criterion estimate of “frequency of victimiza- 
tion” was obtained during a 5-week period of ob- 
servation. An observer was present in each classroom 
for each of the periods and recorded each aggressive 
episode including the aggressor, the aggressive re- 
sponse, the name of the victim, and the consequence 
provided by the victim. The total number of times 
in which a child was attacked provided an estimate 
of the occurrence of these events for each child. 


RESULTS 


Because the cartoons had associative value 
(were meaningful), it was expected that some 
children would respond to them as if they 
were positive reinforcers and display an in- 
crease in rate. For other children, the car- 
toons were associated with aversive stimuli. 
This latter group of children would be
expected to respond to the pictures with a
moderate decrease in rate. The average change
in rate for the total group was -.37 responses
per slide. A t test of the distribution of
difference scores showed this value was not
significantly different from zero.

As pointed out in the volume edited by 
Harris (1963), the measurement of change 
is an extremely complex problem. When using 
difference scores of the type provided in this 
study, there will undoubtedly be a correlation 
between baseline rate and the magnitude of 
the difference score. Correlations of = 63 
were reported for an instrumental condition- 
ing procedure which was similar in many 
respects to the one being presented in the 
present report (Patterson & Hinsey, 1964). 
For this reason, a simple correlation between 
the change score and the criterion measure 
would be confounded by baseline differences.
As recommended in Harris (1963), the
partial correlation coefficient was used to partial
out the baseline rate. 
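
For readers unfamiliar with the procedure, a first-order partial correlation can be computed from the three simple correlations among change score, criterion, and baseline rate. The sketch below uses the standard textbook formula; the function names and the data are hypothetical, and it is an assumption that the analysis reported here took exactly this form.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y with z partialed out of both."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values for four children (not the study's data).
hypothetical_disruption = [1, 2, 3, 4]   # magnitude of impairment in rate
hypothetical_victimized = [2, 1, 4, 3]   # observed frequency of victimization
hypothetical_baseline = [5, 3, 3, 5]     # base operant rate
print(partial_r(hypothetical_disruption, hypothetical_victimized, hypothetical_baseline))
```

When the baseline is uncorrelated with both other measures, as in these invented values, the partial correlation equals the simple correlation; it departs from the simple correlation to the extent that baseline rate is related to the change score or to the criterion.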

In spite of the weak general conditioning 
effect, the data from both schools showed that 
the more frequently a child had been victim- 
ized, the more the aggressive stimuli dis- 
rupted his rate of responding. The partial 
correlation for one school was .52 (N = 7); 
the comparable correlation for the other 
school was .70 (N = 8). 

These data constitute an impressive begin- 
ning. Not only do they offer support for the 
general assumptions being made here, but the 
magnitude of the correlations suggested a 
relation which is at least comparable to the 
findings obtained by traditional psychometric 
devices. 


Discussion 


The data presented in this report provide 
support for the general feasibility of con- 
structing laboratory procedures to fulfill tra- 
ditional assessment functions. The procedure 
described in the present paper is clearly ex- 
ploratory. However, the results showed that 
it was possible to make predictions to a cri- 
terion of social behavior. The level of success 
in making these “predictions” about concur- 
rent behavior was better than might have 
been expected. 

Extrapolating a bit, it seems plausible to 
assume that one could sample other sets of 
social behaviors which had been conditioned 
to various UCS’s in the lives of most indi- 
viduals. For example, in the case of children 
it is a frequent and prime insult to accuse 
another child of being “a baby,” or to accuse 
a boy of being “a sissy.” Photographs of 
older children dressed as infants or of young 
boys dressed as girls would probably serve 
as aversive stimuli. For some younger
children, being left by the parent is another
conditioned stimulus associated with fear. In
all three cases, it would be predicted that 
making such classes of stimuli contingent 
upon a lever-pressing response should result 
in a mildly suppressive effect upon the rate 
of responding for some children. It is also 
reasonable to assume that there would be 
individual differences among children in terms 
of the magnitude of this impairment effect
upon the rate of the instrumental response.
In the examples given above, the criterion to 
which one would relate these differences in 
effect would not be difficult to imagine. 

In keeping with the procedures outlined 
thus far, it should be possible to make pre- 
dictions to a wide spectrum of those social 
behaviors which in the past have been fre- 
quently associated with pain or discomfort. 
However, this line of reasoning would also 
provide its own set of limitations. For ex- 
ample, one could conceive of the set of social 
behaviors subsumed under the terms “achieve- 
ment” or “responsible” as being associated 
with aversive stimuli, but more likely such 
contingencies played a rather minor role for 
the acquisition of such behaviors in the condi- 
tioning histories of most people. To the extent 
that aversive associations played a minor role 
in the acquisition and maintenance of the 
criterion behaviors, the laboratory assessment 
procedures described thus far would be of 
limited value. 

With these limitations in mind, it might 
be of some interest to consider a set of con- 
ditions in which assessment devices could be 
constructed that would have broader gen- 
erality. The laboratory studies completed by 
Munsinger (1964) in our laboratory may 
provide a basis for extending the assessment
paradigm. He showed that the reinforcing ef- 
fectiveness of stimuli varied as a function of 
the associative value of the stimulus. In gen- 
eral, the more meaningful a stimulus the 
greater its effect in strengthening a response. 
The stimulus which elicited a large number 
of associations proved to be a more effective 
reinforcer when this stimulus was made
contingent upon a motor response in an
instrumental conditioning procedure.

These findings provide an interesting basis 
for speculation about possible differences 
among individuals. For example, it seems 
likely that there are broad classes of social 
behaviors which would elicit a large number 
of associations from all individuals. By the 
same token, descriptions of these behaviors 
would function as effective reinforcers for
most individuals if these descriptions were
made contingent upon a response in an
instrumental conditioning task. It seems
doubtful that a laboratory procedure could be
devised which would be sensitive enough to
differentiate among individuals using stimuli 
which were uniformly of high associative 
value. However, it is likely that there are 
some social behaviors which are rather unique 
to the individual and which would
concomitantly be of high associative value to him. For
example, terms such as “hackle,” “caddis,”
or “coachman” would have some additional 
associations for the fly fisherman. Presum- 
ably, visual representations of such stimuli 
would be more effective as reinforcers for 
these individuals. 

When presented with a particular criterion 
of social behavior to which predictions are 
to be made, the experimenter may decide 
then to sample only those behaviors which 
would be rather unique to the criterion. Pre- 
sumably, these behaviors would be uniquely 
reinforcing (high association value) to a 
small group of subjects. For example, if the 
criterion to be predicted was delinquent be- 
havior it would seem plausible to construct 
photographs of behaviors relating to a vari- 
ety of social behaviors, for example, riding 
motorcycles very fast, siphoning gas, stealing 
hub caps, stealing cigarettes from a store, 
fighting, violating traffic signals, showing dis- 
respect for a teacher. Undoubtedly, some of 
these would be reinforcers for the nondelin- 
quent adolescent, but the delinquent has quite 
likely a broader hierarchy of such behaviors. 
For the latter, the stimulus set should func- 
tion as more effective reinforcers. 

Before considering extensive testing of the 
general assumptions, there are several prob- 
lems which must be met. Part of the difficulty 
lies in identifying the most effective measure 
of response strength to be used in the labora- 
tory task. There is no single, best measure 
of response strength. As shown in the review 
by Parton and Ross (1965), time-dependent 
measures of response strength tend to be
rather unstable. Time-independent measures,
while they are more reliable, are also more 
likely to be confounded by the effects of 
the structure of responding which occurred 
just prior to the advent of the reinforcer 
(Patterson & Hinsey, 1964). In addition, if 
the reinforcement schedules are simple one- 
to-one contingencies, there is little reason to 
doubt that the mediational processes will
determine some of the variance in
measures of reinforcement effect (Farber, 1963;
Kanfer, 1966). 

This listing of a number of potential con- 
tributors to variance is intended to give 
pause to investigators who might believe that 
to solve the problems of assessment it is 
only necessary to “condition” a subject and 
to correlate the laboratory measure with a 
criterion. It is the impression of the writer 
that some of these problems in the measure- 
ment of reinforcement effect can be met, and 
that the possible contribution to assessment 
by these laboratory procedures merits the 
effort. For example, subjects could be
classified as to the kind of response structure
occurring prior to the introduction of the
reinforcement contingencies (Patterson & Hinsey,
1964). The different classes could then be 
analyzed separately to determine the relations 
holding between measures of reinforcement 
effects and criterion variables. Also, it should
be possible to use more complex conditioning
technologies of the kind described by
Hefferline, Keenan, and Birch (1963) in which the
ostensible contingency, for example key tap- 
ping to produce pennies, was irrelevant and 
the “real” contingency involved an eye-blink 
response. 

Investigators should be encouraged to 
undertake these problems not with the object 
that laboratory procedures will provide a 
panacea for the problems besetting modern 
day assessment but rather as an approach 
which is potentially able to feed a very 
different kind of data into the assessment 
model.


DIMENSIONALITY OF BARRON’S EGO-STRENGTH SCALE¹


KENNETH B. STEIN AND CHEN-LIN CHU²
University of California, Berkeley 


A sample of 310 Ss composed of 90 normals, 150 anxiety reactions, and 
70 schizophrenics was used in a cluster analysis of Barron’s 68-item Es scale. 
The 5 oblique clusters which emerged were: (a) emotional well-being, (b) 
cognitive well-being, (c) physical well-being, (d) religious attitude of non- 
belief and nonparticipation, and (e) seeking heterosexual stimulation and 
escape from boredom. In a hierarchical analysis, it was found that the first 
3 clusters could be combined in a single condensed cluster called sense of well- 
being. Consistent significant mean differences were found between the normal 
and abnormal groups in both the original sample as well as in a replicated 
sample of 100 psychiatric and 100 normal Ss for the well-being clusters but 
not for the religious and heterosexual clusters. The results are discussed both 
in terms of their empirical and conceptual relevance to the ego-strength
construct.


The construct ego strength (Es), which has 
its roots in psychoanalytic theory, has had 
wide currency in psychological theory and 
research, particularly in the fields of clinical 
psychology and personality. Test scores and 
scales have been developed as operationally 
defined measures of Es and these have then 
been related to other variables either in
hypothesis testing or in shotgun approaches
seeking correlates of the test measure.


¹ This research was supported in large part by
Public Health Service Research Grants MH 0811-01
to 04 and MH 18134-01 to 05, from the National
Institute of Mental Health, R. C. Tryon, principal
investigator. A complete set of tables, presenting the
correlation matrix as well as the oblique factor
coefficients and communalities of the items for both
the five and three dimensional analyses, has been
deposited with the American Documentation
Institute. Order Document No. 9194 from ADI
Auxiliary Publications Project, Photoduplication Service,
Library of Congress, Washington, D. C. 20540.
Remit in advance $1.75 for microfilm or $2.50 for
photocopies and make checks payable to: Chief,
Photoduplication Service, Library of Congress.

We wish to express our thanks to Robert
C. Tryon and the late Richard Sears (who
collected the data of the Es on the two psychiatric
samples) for turning over the data to us for this
analysis. To these two we added a third sample
of normals. For this third sample we wish to thank
Donald W. MacKinnon, Director of the Institute
of Personality Assessment and Research, for
permission to use the MMPI data on the military
officers. Jack Block was kind
enough to make available his set of punch cards
with all of the MMPI item responses on these
officers.

² Chen-Lin Chu is now at the University of
Michigan.

One instrument which has enjoyed con- 
siderable popularity has been Barron’s Es 
scale (1953a, 1953b, 1956). This scale was 
derived from the MMPI and consists of 
items which differentiated patients who re- 
sponded to psychotherapy from those who did 
not. Although originally conceived as a scale 
of response to psychotherapy, Barron’s 
studies led him to conclude that the scale 
measured Es. Since the development of the 
Es scale over a decade ago, a wide array of 
empirical findings has accrued. Partly the 
appeal of this instrument stems from the fact 
that it is brief, simple, and self-administering. 
Further, the Es scale can be easily scored
from the total MMPI. 

The validity of Barron’s Es scale has been 
tested in a variety of studies. A number of
these have produced confirmatory evidence 
for the validity of the scale (cf. Gottesman, 
1959; Himelstein, 1964; Kleinmuntz, 1960;
Quay, 1955; Silverman, 1963; Taft, 1957; 
Wirt, 1955). Other studies have failed to 
find such validity, particularly in regard to 
discriminating Es scores between clinical 
groups such as neurotics and psychotics, or 
have failed to substantiate Barron’s original 
finding of response to psychotherapy (cf. 
Getter & Sundland, 1962; Quay, 1963; Sul- 
livan, Miller & Smelser, 1958; Tamkin, 1957; 
Tamkin & Klett, 1957).

Investigators using the Es scale have
discovered a number of interesting relationships.
Roessler, Alexander, and Greenfield (1963), 
Alexander, Roessler, and Greenfield (1963), 
and Greenfield, Alexander, and Roessler
(1963) noted that Es scores tend to relate 
to physiological responsitivity. Korman 
(1960) related Es scores to resolution of 
discrimination conflict (Block, 1961, failed 
to replicate), while Cline, Meeland, Egbert, 
Brown, Spickler, and Forgy (1956) were able 
to differentiate fighters from nonfighters in 
the military. Barron (1963) related Es to 
creativity and originality. 

The above studies shed some light upon 
understanding Barron’s scale through its rela- 
tionship to external and independent vari- 
ables. Another approach to such understand- 
ing is through an analysis of the internal 
structure of the scale itself. Barron (1953a) 
formed rational groupings based upon an 
inspection of the items. The eight categories 
which he formed were as follows: Physical 
Functioning and Physiological Stability; 
Psychasthenia and Seclusiveness; Attitudes 
toward Religion; Moral Posture; Sense of 
Reality; Personal Adequacy, Ability to Cope; 
Phobias, Infantile Anxieties; and a miscel- 
laneous group. Crumpton, Cantor, and 
Batiste (1960) factor analyzed the scale and 
found 14 factors of varying degrees of inter- 
pretability. This study had several limita- 
tions. First, the capacity of the computer 
program could handle only 62 of the 68 items 
of the Es scale, and, second, the patient and 
control groups were not matched for age and 
education. The authors draw the conclusion 
that Barron’s scale measures the absence of 
ego weakness rather than the presence of ego 
strength. 

The present study is addressed to the 
derivation of the internal structure as a means 
of understanding the type of factors which 
compose this scale. Toward this end three 
subgroups representing a broad range of ad- 
justment and ego strength were selected. A 
computer program that could handle all 68 
items was used. Following the derivation of 
the internal dimensions, each of these dimen- 
sions will be compared between criterion 
groups of supposedly greater and lesser ego
strength. Such comparisons should provide a
validity check in order to determine which 
of these factors are reflections of Es. To check 
on the reliability of these findings, cross- 
validational groups will be compared on each 
of the derived dimensions. 


METHOD 
Subjects 


The initial sample of 310 male subjects consisted 
of the following subgroups: a total of 70 schizo- 
phrenic (T. Schiz.) composed of 32 paranoid (P. 
Schiz.) and 38 other schizophrenics (O. Schiz.), 150 
anxiety reactions (Anx.), and 90 military officers 
(Normals). The psychiatric subjects were outpatients 
in several Veterans Administration clinics and their 
diagnoses were carefully checked. Each of the 
schizophrenic patients had at least one hospitaliza- 
tion for his condition within a 5-year period pre- 
ceding the administration of the MMPI in the clinic. 
The anxiety subjects had no such history of hospi- 
talization for a psychiatric condition. 

Eighty-five of the 90 officers were part of 100 
who took part in the live-in assessment at the 
Institute for Personality Assessment and Research, 
University of California, Berkeley. The schizo- 
phrenic, anxiety, and normal subjects were matched 
for age and education with a mean age of approxi- 
mately 33 years and a mean education of approxi- 
mately 13 years. 

A second sample which was used for cross-validational
purposes consisted of 100 general psychiatric
patients in a Veterans Administration clinic and 
100 general medical patients who were examined 
and considered free of psychiatric involvement. The 
mean age was slightly older than the initial sample, 
approximately 36 years, while the average education 
was 12 years, slightly lower. 


Procedure 


The procedure used in the analysis of the 68 Es 
items was the BC TRY system of cluster analysis. 
A detailed description of this system can be found 
elsewhere (Tryon, 1958; Tryon & Bailey, 1965, 
1966). Briefly, the system is composed of a series 
of component programs which guide the data 
through successive stages of analysis by means of 
the 7094 IBM computer. There are various options 
at different stages of analysis, but the more standard 
programs were used in the current analysis. These 
consist of preparing the data for processing (DAP) 
by later components; correlation matrix (COR?);
diagonal values program (DVP41); the dimensional
analysis or factoring (CC5); and the description of
the oblique structure of the dimension-defining
clusters both statistically (CSA2) and by geometric
configuration (SPAN2). Several runs with revised
definers were made which involved repeating the
CC5, CSA2, and SPAN2 components. This analysis,
which focuses on the variables (in the present
research, items), is called the V analysis.


3 We wish to express our appreciation to Jerome
Fisher, Christine Miller, William Riess, and Alex
Nemeth for making available the second set of
samples of medical and psychiatric subjects.

One of the features of the CSA2 component is 
that it provides the alpha reliability of each cluster 
or dimension based on the defining variables as well 
as the reliability singly and cumulatively of additional
nondefining variables. Thus, if the investigator
were interested in developing multidimensional
scales which could be used singly, or configuration- 
ally, as, for example, in an object or inverse analysis, 
the data on reliability would be invaluable. This 
same CSA2 component also provides the generality 
of each cluster based on the original rather than the 
residual matrix. 
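
The alpha reliability reported for each cluster can be illustrated with a
minimal sketch. It assumes the usual coefficient alpha formula applied to
dichotomous item scores (1 = answered in the keyed direction, 0 = otherwise);
the function name and the six-subject example data are hypothetical and are
not taken from the BC TRY programs or from the present samples.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Coefficient alpha for a subjects-by-items table of item scores.

    responses: list of rows, one per subject; each row lists that
    subject's score on every defining item of a single cluster.
    """
    k = len(responses[0])  # number of items in the cluster
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across subjects.
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    # Variance of the subjects' total cluster scores.
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: six subjects, three items (not from the study).
example = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
alpha = cronbach_alpha(example)
```

Because alpha rises with the number of items as well as with their
intercorrelation, a value near .70 from only five items points to fairly
homogeneous content.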


RESULTS 


The cluster analysis produced five clusters, 
three highly oblique and two orthogonal. The 
three highly intercorrelated clusters appear to 
be subclusters which tap three major aspects 
of a general sense of well-being. Each of the 
clusters will be presented with the defining 
items only, the oblique factor coefficients and 
the keying in the Es scale direction.* 

Cluster 1. Emotional well-being or freedom
from disabling anxiety and depression. This 
cluster contains nine defining items. The dis- 
tribution of these items among the clinical 
and special MMPI scales tends to confirm the 
interpretation given to the scale based on the 
item content. Eight of these nine items appear
in Welsh’s A factor scale. As for the
standard MMPI scales, four of the items
appear in Pt, three in Si, four in no scale.
Some of the items overlap with a number 
of scales such as D, Pd, Hy, Mf, and Sc and 
several items appear in Edwards’ SD scale. 
When these items are viewed in relation to 
their distribution in Barron’s rational clusters, 
eight of these nine appear in two categories: 
four items appear in Psychasthenia and 
Seclusiveness and another four items are 
found in his category of Personal Adequacy. 

This cluster has the highest alpha relia- 
bility (.86) of all the clusters as well as the 
highest generality. Following are the defining 
items of Cluster 1: 


*Factorially all the items emerged keyed in the 
same direction as Barron’s Es scale with one excep- 
tion: Item 95 which appears in Cluster 4 is scored 
false (F) factorially, but true (T) in Barron's scale.


MMPI     Factor
number   coef.    Item

555       .72     I sometimes feel that I am about to go to pieces. (F)
236       .71     I brood a great deal. (F)
 94       .69     I do many things I regret afterwards (I regret things
                  more or more often than others seem to). (F)
544       .68     I feel tired a good deal of the time. (F)
217       .67     I frequently find myself worrying about something. (F)
 32       .59     I find it hard to keep my mind on a task or a job. (F)
389       .55     My plans have frequently seemed so full of difficulties
                  that I have to give them up. (F)
384       .55     I feel unable to tell anyone all about myself. (F)
359       .54     Sometimes some unimportant thought will run through my
                  mind and bother me for days. (F)
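
A short sketch of how a subject's score on Cluster 1 could be obtained from
the keying shown in the listing above. It assumes simple unit weighting (one
point per item answered in the keyed direction), which is consistent with the
nine-point maximum for this cluster but is not spelled out in the text; the
function name and the sample answers are hypothetical.

```python
# Cluster 1 defining items (MMPI item numbers) with their keyed responses,
# copied from the listing of defining items.
CLUSTER_1_KEY = {
    555: "F", 236: "F", 94: "F", 544: "F", 217: "F",
    32: "F", 389: "F", 384: "F", 359: "F",
}


def cluster_score(answers, key):
    """Count the items answered in the keyed direction.

    answers: dict mapping MMPI item number to "T" or "F".
    Items the subject left unanswered contribute nothing (an assumption).
    """
    return sum(1 for item, keyed in key.items() if answers.get(item) == keyed)


# Hypothetical subject who answered only three of the nine items.
sample_answers = {555: "F", 236: "T", 94: "F"}
score = cluster_score(sample_answers, CLUSTER_1_KEY)
```

The same function applies to the other clusters once their item numbers and
keyed responses are entered in a dictionary of the same form.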


Cluster 2. Cognitive well-being or freedom 
from disabling primary-process thinking. This 
cluster contains five defining items whose con- 
tent deals explicitly with psychotic thoughts 
and experiences. The items are relatively 
homogeneous considering the fact that this 
small number of items produced an alpha 
reliability coefficient of .70. As might be ex- 
pected, the majority of the items appear in 
the Sc scale of the MMPI. Two of the five 
items appear in Barron’s Sense of Reality 
dimension. Following are the items: 


MMPI     Factor
number   coef.    Item

349       .64     I have strange and peculiar thoughts. (F)
559       .56     I have often been frightened in the middle of the
                  night. (F)
241       .54     I dream frequently about things that are best kept to
                  myself. (F)
 33       .53     I have had very peculiar and strange experiences. (F)
244       .46     My way of doing things is apt to be misunderstood by
                  others. (F)


Cluster 3. Physical well-being or freedom 
from physical complaints. There are six de- 
fining items which compose this cluster. All 
of the items relate to physical health in content.
The reliability of .79 and a generality
of .44 are surprisingly high for such a small 
number of items. The distribution of these 
physical items among the MMPI clinical 
scales discloses that almost all of them occur 
in the triad of Hs, D, and Hy scales which 
is known for its high complement of physical 
symptom complaints. Following are the items: 


MMPI     Factor
number   coef.    Item

 51      -.69     I am in just as good physical health as most of my
                  friends. (T)
153      -.63     During the past few years I have been well most of the
                  time. (T)
189       .61     I feel weak all over much of the time. (F)
 36      -.61     I seldom worry about my health. (T)
 43       .57     My sleep is fitful and disturbed. (F)
 62       .51     Parts of my body often have feelings like burning,
                  tingling, crawling, or like “going to sleep.” (F)


Cluster 4. Religious attitude of nonbelief 
and nonparticipation. This small cluster of 
four defining items has a reliability of .63 
as well as the lowest generality of the clus- 
ters. All items have obvious religious content, 
and these deal on the one hand with belief 
in Biblical miracles and prophecies and, on 
the other, with one’s religious activity and 
participation. Two of the items appear in
the D scale, the only standard scale
represented. All four are listed under Barron’s
Religious dimension. Following are the items:


MMPI     Factor
number   coef.    Item

488       .63     I pray several times every week. (F)
483       .55     Christ performed miracles such as changing water into
                  wine. (F)
 95       .52     I go to church almost every week. (F)
 58       .48     Everything is turning out just like the prophets of the
                  Bible said it would. (F)


Cluster 5. Seeking heterosexual stimulation 
and escape from boredom. This cluster with 
six items fares no better than the religious 
cluster in both its reliability and generality. 
Four of the items deal directly with sexual
content. These six items are not concentrated
in any one of the standard MMPI scales 
although two of them appear in Si. All six, 
however, are found in Barron’s Moral Posture 
dimension. Following are the items: 


MMPI     Factor
number   coef.    Item

208      -.64     I like to flirt. (T)
231      -.51     I like to talk about sex. (T)
548       .41     I never attend a sexy show if I can avoid it. (F)
181      -.41     When I get bored I like to stir up some excitement. (T)
410      -.36     I would certainly enjoy beating a crook at his own
                  game. (T)
430      -.30     I am attracted by members of the opposite sex. (T)


Table 1 presents the intercorrelations, reli- 
abilities, and generalities of the five main 
clusters of the Es scale. As stated earlier, the
first three clusters are highly oblique to each 
other with correlations as high as .76. As a 
result, a hierarchical condensation of the first 
three clusters was performed and the results 
are presented in Table 2. The new cluster, 
called sense of well-being, consists not only 
of the total 20 defining items which compose 
the three separate well-being clusters, but 
also an additional eight items which have 
sufficiently high communality as well as content
relevance to the combined cluster.5 The
reliability of this newly condensed cluster is 
a respectable .90 and its generality is .66. 

As an attempt to explore the relevance of 
each of these dimensions to the concept of 
ego strength, mean difference tests were per- 
formed between the subsamples. Tables 3 
and 4 show the comparisons between groups. 
Although the P. Schiz. subjects do not differ 
from the O. Schiz. on any of the six com- 
parisons, each of these two subgroups as well 
as the combined T. Schiz. group relate differently
to the Anx. neurotic group. The P.
Schiz. group has significantly larger means
than the Anx. on four of the six comparisons,
whereas the O. Schiz. subjects are not significantly
different from the Anx. subjects on any
of the comparisons. Clusters 1, 2, and 3, the
well-being clusters, as well as the combined
or condensed cluster of all three, show consistent
differences between the normal and
the two clinical groups. However, the differences
on these well-being clusters are not as
consistent between the abnormal subgroups.
As noted above, the O. Schiz. and Anx. groups
do not differ on any clusters, while the
P. Schiz. subjects show greater well-being on
the emotional and physical clusters than the
Anx. subjects.


5 The additional eight items are as follows: 2, 22,
100, 187, 192, 234, 344, and 421.


TABLE 1
INTERCORRELATIONS, RELIABILITIES, AND GENERALITIES OF THE FIVE Es ITEM CLUSTERS

                            C1         C2         C3        C4         C5
Cluster                     Emotional  Cognitive  Physical  Religious  Heterosexual  Generality
C1 Emotional well-being     (.86)                                                    .51
C2 Cognitive well-being      .67       (.70)                                         .37
C3 Physical well-being       .76        .48       (.79)                              .44
C4 Religion                 -.06        .24        .15      (.63)                    .14
C5 Heterosexuality           .09       -.01        .32       .18       (.60)         .15

Note: Correlations based upon the defining items only. The diagonal values in parentheses are the alpha reliabilities. The
generality value of each cluster represents the proportion of overall communality exhaustion produced by each dimension alone.

The religious and heterosexual clusters are 
also inconsistent between subgroups. On 
religion, the P. Schiz. patients are the ones 
who show differences, both with the Anx. and 
normal subjects. Interestingly, these para- 
noids show a significantly greater negative 
religious attitude than the Normals which is 
contrary to Barron’s findings. As for the 
heterosexual cluster, the Normals are con- 
sistently higher than the abnormal groups, 
but the clinical subgroups are not different 
from each other. 

As a cross-check on these findings, a sample
of hospitalized medical patients free of psychiatric
disorders and a group of general
psychiatric patients in a mental hygiene 
clinic were scored on the five clusters as well 
as on the combined first three. Table 5 shows 
that the well-being clusters continue to be 
significantly different between nonpsychiatric 
and psychiatric subjects. It is interesting to 
note, but not necessarily unexpected consider- 
ing the nature of the control group, that on 
physical well-being the difference is significant
only at the .10 level. However, the religious
and heterosexual clusters are not significantly 
different. Thus the well-being clusters show 
consistent findings across the original and 
replicated samples, but not so for Clusters 4 
and 5, religion and heterosexuality. 
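
The t ratios used in these comparisons can be approximated from the published
means and standard deviations alone. The sketch below assumes an ordinary
pooled-variance t ratio for two independent groups; the text does not state
which formula was used, so both the formula and the function name are
assumptions. The inputs are the Table 5 entries for the medical and
psychiatric samples on physical and emotional well-being.

```python
from math import sqrt


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """t ratio for two independent groups from summary statistics,
    using the pooled-variance estimate of the standard error."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error


# Physical well-being: medical (M = 3.48, SD = 1.41) vs. psychiatric
# (M = 3.10, SD = 1.65), N = 100 in each group.
t_physical = pooled_t(3.48, 1.41, 100, 3.10, 1.65, 100)

# Emotional well-being: medical (M = 6.91, SD = 1.75) vs. psychiatric
# (M = 3.88, SD = 2.65), N = 100 in each group.
t_emotional = pooled_t(6.91, 1.75, 100, 3.88, 2.65, 100)
```

Under this assumption both values come out close to the tabled ratios of
1.752 and 9.530; the small discrepancies are of the size expected from
rounding in the published means and standard deviations.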


TABLE 2
INTERCORRELATIONS, RELIABILITIES, AND GENERALITIES OF THE THREE Es ITEM CLUSTERS

Cluster                    Well-being    Religion      Heterosexuality  Generality
C1 (1+2+3) Well-being      (.90)                                        .66
C2 (4) Religion            [illegible]   [illegible]                    [illegible]
C3 (5) Heterosexuality      .15           .18          (.59)            [illegible]

Note: Correlations based upon the defining items only. The diagonal values in parentheses are the alpha reliabilities. The
generality value of each cluster represents the proportion of overall communality exhaustion produced by each dimension alone.
The number in parentheses under the cluster column represents the cluster number based upon the five clusters.


TABLE 3
MEANS AND STANDARD DEVIATIONS FOR ORIGINAL SUBGROUPS
ON THE ITEM CLUSTERS OF THE Es SCALE

                               Total sample    Normals        Anx.           Tot. Schiz.    P. Schiz.      O. Schiz.
                               (N = 310)       (N = 90)       (N = 150)      (N = 70)       (N = 32)       (N = 38)
Cluster                        M      SD       M      SD      M      SD      M      SD      M      SD      M      SD
C1 Emotional well-being (9)     4.77  3.01      7.76  1.68     3.37  2.40     3.94  2.86     4.50  2.83     3.47  2.84
C2 Cognitive well-being (5)     3.37  1.49      4.01  1.14     3.11  1.59     3.11  1.45     3.06  1.48     3.16  1.44
C3 Physical well-being (6)      3.64  1.63      4.90  0.77     2.97  1.68     3.46  1.42     3.56  1.39     3.37  1.46
C4 Religion (4)                 2.40  1.27      2.19  1.31     2.37  1.25     2.75  1.24     2.94  1.13     2.58  1.31
C5 Heterosexuality (6)          [illegible]     4.23  1.93     3.39  1.55     3.43  1.52     3.50  1.48     3.37  1.57
C1+2+3 Well-being (28)         17.68  6.51     23.93  3.37    14.55  5.68    16.33  5.67    17.09  6.07    15.68  5.32

Note: C1+2+3 contains 28 items, 8 more than the sum of the items of the first three clusters. These additional eight meet the
lower bound criterion of communality to be included here but were excluded from the individual clusters due to almost equal factor
coefficients on two or all three of the first three clusters. The value in parentheses beside the cluster name is the total number
of items in cluster.


TABLE 4
COMPARISON OF NORMAL, ANXIETY, AND SCHIZOPHRENIC SUBGROUPS ON THE
ITEM CLUSTERS OF THE Es SCALE BY MEANS OF t RATIOS

                                    P. Schiz.  P. Schiz.  P. Schiz.   O. Schiz.  O. Schiz.    Anx.         Total Schiz.  Total Schiz.
Cluster                     Items   O. Schiz.  Anx.       Norm.       Anx.       Norm.        Norm.        Anx.          Norm.
C1 Emotional well-being       9      1.510      2.333**   -7.759***    0.222     -10.612***   -15.221***    1.540        -10.538***
C2 Cognitive well-being       5     -0.272     -0.144     -3.734***    0.180      -3.572***    -4.712***    0.034         -4.386***
C3 Physical well-being        6      0.566      1.876*    -6.727***    1.352      -7.766***   -10.308***    2.118**       -8.225***
C4 Religion                   4      1.213      2.361**    2.863***    0.900       1.536        1.088       2.055**        2.714***
C5 Heterosexuality            6      0.359      0.357     -2.349**    -0.089      -2.903***    -4.093***    0.158         -3.312***
C1+2+3 Well-being            28      1.036      2.271**   -7.850***    1.111     -10.556***   -14.240***    2.161**      -10.552***

* p < .10.
** p < .05.
*** p < .01.


TABLE 5
MEANS, STANDARD DEVIATIONS, AND t RATIOS FOR REPLICATED
SUBSAMPLES ON THE ITEM CLUSTERS OF THE Es SCALE

                                                             Mixed
                           Total sample    Medical           psychiatric
                           (N = 200)       (N = 100)         (N = 100)
Cluster                    M      SD       M      SD         M      SD       t ratios
C1 Emotional well-being     5.40  2.70      6.91  1.75        3.88  2.65     9.530***
C2 Cognitive well-being     3.88  1.13      4.26  0.84        3.49  1.26     5.094***
C3 Physical well-being      3.29  1.54      3.48  1.41        3.10  1.65     1.752*
C4 Religion                 2.44  1.29      2.44  1.34        2.44  1.27     0
C5 Heterosexuality          3.41  1.49      3.46  1.49        3.35  1.51     0.519
C1+2+3 Well-being          18.84  5.33     21.42  3.90       16.26  5.35     7.790***

* p < .10.
*** p < .01.


DISCUSSION


Five clusters emerged from the analysis of
the Es scale. It is of interest to note that
these dimensions are similar in varying degrees
to five factors in the Crumpton, Cantor,
and Batiste (1960) study. These factors were
described as follows: religious attitude; anxious,
ruminative, worrying, obsessive, distractible
feelings or behaviors; physical lassitude,
somatizing and tension; active and
disturbing fantasy life; and heterosexuality. 
Thus even with the use of a different factor- 
ing method as well as some of the limitations 
of their study noted earlier, the structural 
similarity of these factors and the clusters 
of the present study reflects a reliable finding.

Not all of the clusters in the present study 
could be related empirically to the concept 
of ego strength. The first three clusters, those 
pertaining to different facets of well-being, 
showed consistent differences between the 
normal and psychiatric groups across the 
initial and replicated samples. Comparisons 
between the psychiatric subgroups in the 
initial sample fared no better than previous 
studies (Tamkin, 1957; Tamkin & Klett, 
1957; Quay, 1963). The O. Schiz. and Anx. 
subjects did not differ on any of the clusters 
while the P. Schiz. subjects showed greater 
emotional and physical well-being than the 
Anx. patients. When the three well-being 
clusters were combined into one dimension, 
the P. Schiz. patients again showed greater 
well-being than the Anx., while the Anx. and 
O. Schiz. patients did not differ. 

There are several possible explanations for 
these findings among the psychiatric subgroups
of the initial sample. The P. Schiz.
subjects are often described clinically as
possessing a suspicious and defensive attitude
in which they deny to themselves and project
onto others undesirable attributes and conflicts
over impulse expression. It is therefore
not unreasonable to assume that the P. Schiz.
subjects probably capitalized upon a major
weakness of many inventory-type assessment
instruments, namely, the social desirability
response set. Judging from the results
of this study, the P. Schiz. subjects endorsed
the items in the direction of greater well-being
more so than the Anx. subjects in both
the emotional and physical but, interestingly
enough, not in the cognitive area. This response
set, if such was the case, was not
sufficient, however, to obliterate differences
between the P. Schiz. subjects and the Normals.
The controls showed significantly higher
scores on each of the well-being clusters.
This finding suggests that abnormal groups
such as the P. Schiz. may be limited in the
extent to which they can utilize this response
set.

The absence of differences between the 
Anx. and O. Schiz. subjects may be a function 
of neurotics often attempting to make them- 
selves look quite sick and disturbed. It is 
not uncommon to find higher ranging MMPI 
profiles among neurotics than psychotics re- 
flecting a “cry for help.” It may be this 
tendency to present themselves in this dis- 
turbed light which obliterates differences be- 
tween the Anx. and O. Schiz. subjects. It is 
further suggested that the social desirability 
response set operating in opposite directions 
for the Anx. and P. Schiz. subjects rather 
than the differential in ego strength produced 
higher well-being scores for the latter group 
of patients (cf. Block, 1965, for a critical 
review of the response set issue). 

The fourth cluster, religious nonbelief and 
nonparticipation, did not consistently dif- 
ferentiate the subgroups and therefore ap- 
peared less related to the construct of ego 
strength. This cluster, however, did produce 
a singular finding in which an attitude of 
religious nonbelief and nonparticipation was 
characteristic of the P. Schiz. patients. The 
P. Schiz. group obtained significantly higher 
scores than both the Anx. and Normal groups. 
An explanation for this finding will require 
further investigation. 

The last cluster, seeking heterosexual 
stimulation and escape from boredom, was 
also inconclusively linked to the concept of 
ego strength. On the initial sample the nor- 
mals obtained significantly higher scores than 
each of the psychiatric subsamples. This 
difference, however, washed out with the 
replicated medical and psychiatric groups. 
These results on the heterosexual cluster may 
be a function of the particular control groups 
used. The military officers of the initial
sample may constitute a unique group in
relation to the heterosexual factor. The medi- 
cal group from the replicated sample may 
have scored unusually low as a function of 
their physical illness and debility. Another 
normal group is needed in order to further 
assess the relevance of this heterosexual 
cluster to ego strength.

Thus far the discussion has pertained to
the relation of each of the five clusters to
the concept of ego strength based upon em- 
pirical findings of differences between cri- 
terion groups with purportedly greater and 
lesser ego strength. There is also the question 
of the conceptual relevance of the clusters 
to ego strength. 

As mentioned earlier, the construct ego 
strength has its roots in psychoanalytic 
theory. As the name implies it refers most 
broadly to the relative adequacy of function- 
ing of the ego system generally. The ego sys- 
tem is conceptualized as composed of a 
number of major subsystems. The adequacy 
of functioning of each of these major sub- 
systems as well as their interrelationships un- 
doubtedly constitutes a more differentiated 
approach to the assessment of ego strength 
(cf. Rapaport’s 1951, 1959 systematic treat- 
ment of psychoanalytic theory). The first 
three clusters or well-being dimensions, that is, 
emotional, cognitive, and physical (motoric), 
appear to have reference to similarly con- 
ceptualized major ego areas or substructures 
in psychoanalytic theory. The effectiveness 
with which these three areas function is 
dependent upon such factors as the control 
of potentially intrusive and disturbing im- 
pulses as well as the extent of energy avail- 
able to cope with stimulus situations in the 
external world. Similar conceptual linkages 
between the religious and heterosexual clus- 
ters and ego structures do not emerge as 
readily. 

In conclusion then, Barron’s scale is related 
to the construct ego strength conceptually 
and empirically only in part. Three clusters 
show empirical validity only in a gross sense: 
that is, when extreme groups such as psychi- 
atric and normal groups are compared. These 
same dimensions, however, lack validity for 
finer discriminations such as between abnor- 
mal groups. Similar findings from other 
studies cited above add a consistency to this 
conclusion. The suggested explanation was 
offered that the social desirability response 
set is operative in opposite directions for such 
abnormal groups as P. Schiz. and Anx. neurotics,
thereby reversing expectations with
respect to hierarchical ordering by ego
strength level. The lack of difference between
the Anx. and O. Schiz. subjects was explained
principally by the Anx. presenting
themselves in a negative light. Various valid- 
ity scales have been developed in an attempt 
to cope with the response set problem (cf. 
Edwards, 1954; Gough, 1947, 1952, 1957; 
Hathaway & McKinley, 1951; Meehl & 
Hathaway, 1946). Further research on the Es 
scale which would attempt to correct for 
response set might disclose an empirical valid- 
ity among abnormal groups which it now 
lacks.


INTERACTION PATTERNS IN CHILDREN WITH
PHENYLKETONURIA *

To elucidate the role of emotional factors in PKU, 4 subject groups comprised
of PKU, retarded and/or brain damaged, psychotic, and normal chil-
dren were compared on a measure of interaction behavior. On total interaction 
scores, the PKU group was found to perform significantly poorer than the 
normals, but significantly better than the psychotics. Differences between the 
PKU group and retarded and/or brain-damaged group tended toward signifi- 
cance, although on separate comparisons for the 3 social stimulus conditions 
the differences between these 2 groups were not significant. The PKU group 
was found to be the most heterogeneous, and the clustering of scores sug- 
gested that phenylketonuria is behaviorally not a unitary disorder. Correlations 
of intelligence criteria and interaction scores for the PKU group further indicated
that the interaction measure may tap functions not assessed by standardized
IQ tests.


Since the initial description of this dis- 
order by Folling (1934), there have been 
a number of reports on behavioral deviations 
concomitant with phenylketonuria (PKU) 
(e.g., Bjornson, 1964; Jervis, 1954; Kaplan, 
1962; Karrer & Cahilly, 1965; Lyman, 1963). 
Most prominent of these is mental deficiency 
(e.g., Hsia, Knox, & Paine, 1957; Kaplan,
1962). Organic brain damage or cerebral 
dysfunction, as diagnosed from EEG tracings, 
tremors, problems in coordination, etc., have 
also been mentioned as accompanying the 
biochemical disturbance. 

Disturbances in personality are cited as 
second in prominence to indications of mental 
retardation (Kaplan, 1962). The descriptions 
of observed symptomatology span the range 
from mild to moderate neurotic-like patterns, 
to those more characteristic of psychoses. 


1 This research was supported in part by Research
Child Health and Human Development, United
the Children’s Bureau, United States Department of
Health, Education, and Welfare.

Jervis (1954), in reviewing 300 cases of
PKU, has stated that 277 of these presented 
evidence of emotional disturbance. Further- 
more, the predominant patterns could be di- 
chotomized as being “hyperactive-destruc- 
tive” or “passive-apathetic.” Koch, Fishler, 
Schild, and Ragsdale (1964) noted that 5 
of the 30 PKU children involved in their 
investigation had originally been diagnosed 
as autistic and schizophrenic, and at the time 
of the study they evidenced excessive rock- 
ing, arm waving, and overall aimlessness. 
This entire group manifested erratic, hyper- 
active, and often unpredictable response 
tendencies. Woolley (1962, 1965) shows im- 
plicit recognition of two types of PKU chil- 
dren: retarded and psychotic. His rationale 
for these alternative outcomes is based on a 
theory which biochemically links PKU to 
psychosis and is dependent on whether the 
prescribed diet which is intended to reduce 
the level of phenylalanine in the blood is 
followed or not. A relationship between PKU 
and childhood psychosis has been suggested 
by Bjornson (1964) in the case of an 11- 
year-old PKU girl whom he had evaluated. 
Evidence of a thought disturbance resembling 
that observed in childhood schizophrenia was 
presented. From other cases, he concluded
further that psychotic symptoms, including
affective lability, catatonia, intense outbursts
of fear, and autistic behaviors, may frequently
be associated with PKU. Additional
confirmation of similarities in behavioral 
aberrations among groups of PKU, function- 
ally psychotic, and organically psychotic 
children comes from a study reported by 
Yaker and Goldberg (1963). Using a check- 
list, they noted a significant frequency of 
behavioral communalities among the three 
groups. 

Reports on behavioral deviations have been 
based largely on clinical impressions and have 
tended to be unsystematic, while the more 
thorough investigations have generally in- 
volved a small number of children. In light 
of the current tendency to view PKU as a 
model for other disorders that are genetic in 
origin and which have metabolic abnormali- 
ties (Karrer & Cahilly, 1965), the need for 
systematic investigation of PKU on a be- 
havioral level appears to be a worthy en- 
deavor. This seems to be especially desirable 
in light of recent accounts in the literature 
(Bessman, 1964) which imply that PKU is 
not a unitary disorder with as predictable a 
course as was formerly believed. Rather, 
such children vary as to intelligence level, 
the presence and severity of emotional dis- 
turbance, response to dietary treatment, etc. 
(Kaplan, 1962; Karrer & Cahilly, 1965; 
Lyman, 1963). Worthy of note is that, as 
some investigators have cautioned (e.g.,
Block, Jennings, Harvey, & Simpson, 1964), 
a group having a common symptom may 
nonetheless be heterogeneous in certain other 
respects, and such variability would obscure 
rather than clarify communalities. 

The present study was an outgrowth of 
some speculation regarding behavioral simi- 
larities between PKU and other diagnostic 
groups. Specifically, there seemed to be a 
need for some common, objective frame of 
reference relative to the symptomatology described
and for a group of subjects sufficiently
large and diverse to make the results
more representative of the population of 
PKU children. From these considerations, 
and from earlier pilot explorations of inter- 
personal and emotional disturbances in PKU 
(Friedman, Wood, & Steisel, 1966), the 
present approach was evolved.

As a procedure for measuring one dimension
of social or emotional behavior in children,
especially tailored to the nonverbal
child, Steisel, Weiland, Denny, Smith, and
Chaiken (1960)? developed a structured play 
situation involving a child and an experi- 
menter and which was geared to assess the 
child’s ability to relate to or interact with 
others. One significant aspect of emotional 
adjustment is this ability to relate to other 
persons and to objects. Characterizations of 
psychotic or autistic children ordinarily in- 
clude mention of their inability to form 
object relations and to respond appropriately 
to things or people in their environment, if 
they show attention to them at all. On the 
other hand, the retarded child is typically 
characterized as positively oriented toward 
others, interested in objects around him, but 
inept in coping with either, except on a 
primitive level. Moreover, the quality of the 
retarded child’s response to persons and ob- 
jects is quite unlike that observed in the 
psychotic and autistic. This first study estab- 
lished the reliability of the technique and 
provided a procedure for the training of 
judges. In a second study, Steisel, Weiland, 
Smith, and Schulman (1961) validated the 
procedure on groups of normal, retarded 
and/or brain-damaged, and psychotic chil- 
dren. The ability to interact for these three 
groups differed significantly from one an- 
other, and the results closely paralleled clin- 
ical impressions. That is, the interaction 
scores for the psychotics were lowest, those 
for the retarded and/or brain-damaged group 
were next lowest, whereas those for the 
normal controls were highest. 

The present study was designed to compare 
a heterogeneous group of PKU children on 
the described measure of interaction behavior 
with groups presenting related clinical symp- 
toms including retardation, brain damage, 
and autism. 


METHOD 
Subjects 
Comparisons were made on data from the interaction
situation on four groups of subjects: normals,
retarded and/or brain-damaged (henceforth abbreviated
as “retarded”), psychotic, and PKU.4

3 Copies of the rating scales, scoring criteria, and
instructions for the experimenter are available from
the authors.

4 The data on all the children in the retarded
group are from the Woods Schools, on 18 psychotic
TABLE 1
CHARACTERISTICS OF SUBJECTS OF FOUR GROUPS

Subject       PKU        Retarded   Psychotic   Normal
Male          23         15         21          5
Female        15         8          5           5
Total         38         23         26          10
Mean age      8.4        17         8.6         5.8
Age range     5.3-17.4   5.0-10.8   4.8-12.9    4.0-7.6
SD a          2.39       1.46       1.65        1.17

a Expressed in years.



PKU subjects were selected from a population of 
outpatients routinely evaluated at a pediatric hos- 
pital. They were heterogeneous with respect to 
being on the special diet, the presence of signs of 
cerebral dysfunction (e.g., seizures, abnormal EEG 
tracings, etc.), and the presence and severity of 
symptoms of emotional disturbance. All PKU sub- 
jects had in common positive urine and blood as- 
says for phenylketonuria. These children were peri- 
odically evaluated as to intelligence and language- 
comprehension ability. Of the 38 subjects in this 
group, 30 were sufficiently responsive and coopera- 
tive to permit formal psychological testing. In these 
cases, either the Stanford-Binet (L-M) or Wechsler 
Intelligence Scale for Children (WISC) and the 
Peabody Picture Vocabulary Test (PPVT) were 
administered.5 For these 30 testable PKU children, 
the range of IQ scores was 38 to 117, the mean 
was 70, and the standard deviation was 18.10. The 
remaining eight children in this group were evalu- 
ated by means of the Vineland Social Maturity 
Scale. They were uniformly found to be nonverbal, 
unrelated, and severely retarded developmentally. 

Children comprising the retarded group were 
drawn from those in residence at the Woods Schools. 
The medical records on these children were screened, 


children from Eastern Pennsylvania Psychiatric 
Institute, and on 9 of the normals were those pre- 
viously reported by Steisel et al. (1961). The ex- 
perimental situation and procedure, method of train- 
ing judges, scale rating criteria, etc., remained the 
same for subjects previously and currently evalu- 
ated. The earlier raw data sheets on the above 
mentioned groups were obtained, scored according to 
the modification described to give equal weight to 
each of the three segments of the interaction situa- 
tion, and pooled with the data on the additional 
psychotics and the one normal, accordingly. 

5 In cases where the WISC was administered, the
Verbal Scale IQ was prorated from scaled scores on
Information, Comprehension, Similarities, and Digit
Span subtests; the Performance Scale IQ was prorated
from scaled subtest scores on Picture Completion,
Block Design, and Coding; the Full Scale IQ
was derived according to the standard procedure
of combining the Verbal and Performance Scale
scores. The PPVT was administered and scored in
accordance with the directions in the manual.
and a possible diagnosis of PKU was ruled out in
all cases. The retarded group was heterogeneous 
with respect to suspected etiology of mental retarda- 
tion. Included in this group were children with con- 
vulsive disorders, cortical atrophy, mongolism, and 
the like. Intelligence estimates were available on 
all but one subject—two subjects were evaluated 
by means of the Gesell scales, the remainder received
either the Stanford-Binet or WISC. The IQ
scores ranged from 38 to 75, with a mean of 50.04
and a standard deviation of 10.0. 

The psychotic children were drawn from two 
institutions, Eastern State School and Hospital and 
Eastern Pennsylvania Psychiatric Institute. These 
children were diagnosed as having childhood schizo- 
phrenia, infantile autism, symbiosis, and the like. 
Laboratory tests for PKU were performed on the 
eight children from Eastern State School and Hos- 
pital, and PKU was conclusively ruled out. Unfor- 
tunately, a determination of PKU had not been 
made for the 18 psychotic subjects from Eastern 
Pennsylvania Psychiatric Institute. However, neuro- 
logical findings via history, neurological examination, 
and EEG tracings were negative in 13 cases. Since 
untreated PKU children very frequently have posi- 
tive neurological signs (Eiduson, Geller, Yuwiller, & 
Eiduson, 1964), the probability of PKU in these 
children seemed very slight. 

Nine of the “normal” children were the offspring 
of friends and of the professional staff who were 
employed at Eastern Pennsylvania Psychiatric Insti- 
tute. Although the raters and the experimenter knew 
who these children were, they had had no prior 
personal contact with them. One of the subjects in 
the normal group was the sibling of a PKU subject. 

The breakdown of subject groups, with data on 
age and sex is presented in Table 1. 

As a precautionary measure, it was made certain 
that subjects in all groups were free from medica- 
tion for a period of 1 month from the time of 
evaluation. Moreover, for the PKU group, the inter- 
action situation was scheduled as the first item upon 
the child’s arrival at the hospital and was per- 
formed in a different building from the one in which 
the children received venipunctures for determination 
of blood phenylalanine level. 


The Interaction Situation 


The interaction situation was carried out in a 
playroom with a one-way screen and was bare of 
enticing, distracting objects. The furniture consisted 
of two chairs and a desk. A shelf, which was 7 
feet from the floor, 8 inches deep, and 6 feet long, 
was attached to the wall opposite the one-way 
screen. On it, in full view of the child, were the
various toys that were available for use. The place- 
ment of the toys on the shelf, which made them 
relatively inaccessible to the children unless they 
had some assistance, was made in order to stimulate 
them to seek interaction with the experimenter to 
obtain what they wanted. 

The experimental procedure is divided into three 
parts: (a) interaction is solicited by the experimenter,
(b) interaction attempts made by the child
are rejected by the experimenter, and (c) interaction
is neither solicited nor rejected by the adult, but
is awaited and responded to. The activity of the
experimenter was carefully and specifically delineated 
in all three parts except for some minimal flexibility. 

There were four tasks in the first phase, during 
which interaction was solicited. The first one, a 
variation of simple tasks found on the Stanford- 
Binet scale, required the child to alternate stringing 
beads with the experimenter. In the second task the 
child took turns with the experimenter in throwing 
quoits at stakes. For the third task the two of 
them raced cars on the floor, while in the final one 
they alternately shot darts at a target. 

At the outset of the second period, during which 
time interaction was rejected, the child was asked 
which one of the toys he would like to have, and 
it was obtained for him. The child was then told 
that the experimenter was going to be busy for 5 
minutes and he then occupied himself by making 
notes or reading. Overtures from the child were 
ignored or rejected by the experimenter. 

After this period was completed, the third and 
final phase was started. Initially, the experimenter 
indicated his availability and asked the child what 
he would like to do. The experimenter then followed
the child’s lead. The entire procedure took about 20 minutes.

There were two or more observers behind the 
one-way screen who recorded the child’s interactive 
efforts independently of one another. 

The rating scale had seven subparts: (a) paying 
attention to the experimenter or instructions (this 
is not scored during the rejection period); (b) pay- 
ing attention to the tasks or objects; (c) following 
instructions, cooperating, or complying (omitted 
during the time that interaction is rejected and while 
it is awaited); (d) initiating or instigating inter- 
action with the experimenter (not scored when 
interaction is solicited by the adult); (e) willingness 
and degree of investment in interaction; (f) com- 
municative sounds; and (g) response to the experi- 
menter’s interactive efforts (judges do not rate this 
variable during that portion of the situation where 
the child is being rejected).

Each of these categories is scored on a five-point 
scale which provides a rating, roughly, of the degree 
of pathology. A score of 1 reflects severe impairment 
while a score of 5 represents a maximum of the 
attribute being assessed. Scores of 2 or 4 are assigned 
when the child’s behavior tends in the particular
direction but is not characteristically or consistently
at the extreme. A score of 3 is given when the
child fluctuates from one end of the continuum to 
the other without consistently being at either. 


Judges 


Judges were recruited from [illegible]
personnel and represented such disciplines as psychology,
psychiatry, social work, public health
nursing, medicine, etc. Potential judges were provided
with a manual describing the procedure to
be followed and the method for scoring the scales. 
TABLE 2
OBSERVER AGREEMENT FOR PKU, NORMAL,
RETARDED, AND PSYCHOTIC GROUPS

Group              Range        M         Mdn
PKU
  Exact index      38.2-100%    65.4%     64.7%
  Lenient index    48.5-100%    78.6%     79.4%
Normal
  Exact index      63-95%       80.22%    84.05%
  Lenient index    95-100%      97.11%    97.0%
Retarded
  Exact index      18-85%       53.63%    62.0%
  Lenient index    50-91%       73.78%    76.5%
Psychotic
  Exact index      30-88%       53.95%    54.5%
  Lenient index    51-100%      89.55%    88.68%


Following this preparation, judges attended several 
practice sessions and were permitted to discuss dis- 
crepancies in their ratings. When no further ques- 
tions regarding the mechanics or criteria for rating 
the scales were forthcoming, the experimental sub- 
jects were introduced. In subsequent sessions, poten- 
tial judges were invited to observe and rate the 
scales, and after the session to discuss their ratings 
with the more experienced judges. When no further 
unclarities were expressed with respect to the cri- 
teria for rating these scales, these staff members 
were subsequently included as judges. In the event 
that more than two judges were present at an 
interaction session and contributed ratings, two 
protocols were selected by a random procedure. 


RESULTS AND DISCUSSION 


After a method described earlier (Steisel 
et al., 1961), reliability was evaluated in
terms of interjudge agreement on scale 
ratings. Two procedures were used. For the 
first, each instance of exact agreement 
(“exact index”) for the 34 ratings contrib- 
uted by the two judges was given a score 
of 1. Ratings on which the judges disagreed
received no credit. The second method for scoring
interjudge agreement might best be described 
as the “lenient index.” This was computed 
by assigning a score of 1 for each instance of 
exact agreement and a score of ½ for all
other instances in which there was a dis- 
crepancy of one scale unit between the ratings 
contributed by the two judges. The ranges, 
means, and medians of these analyses, reported
in terms of percentages for PKU,
normal, retarded, and psychotic groups, are
presented in Table 2.

6 One PKU girl who was observed and rated by
only one judge was dropped from this analysis.
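
The two agreement indices described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' own procedure: the function name `agreement_indices` is hypothetical, the partial credit for a one-unit discrepancy is taken to be one half, and each index is assumed to be expressed as a percentage of the number of paired ratings.

```python
def agreement_indices(ratings_a, ratings_b):
    """Return (exact, lenient) agreement percentages for two judges.

    Assumption: both indices are percentages of the number of paired ratings.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both judges must supply the same, nonzero number of ratings")

    exact = 0  # ratings on which the two judges agree exactly
    near = 0   # ratings that differ by exactly one scale unit
    for a, b in zip(ratings_a, ratings_b):
        gap = abs(a - b)
        if gap == 0:
            exact += 1
        elif gap == 1:
            near += 1

    n = len(ratings_a)
    exact_index = 100.0 * exact / n
    # Assumed half credit for one-unit discrepancies; larger gaps earn nothing.
    lenient_index = 100.0 * (exact + 0.5 * near) / n
    return exact_index, lenient_index
```

By construction the lenient index can never fall below the exact index, which agrees with the pattern of values in Table 2.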

In previous work with the interaction pro- 
cedure (Steisel et al., 1961), scores were ex- 
pressed in terms of all scale ratings. There 
were 24 ratings for the solicited part. This 
consisted of the four tasks, each of which was 
judged on six of the seven scales. Since the 
experimenter took the lead, the child was not 
rated on “initiating or instigating inter- 
action.” Four scales were rated for the 
rejected period. Those that were deleted 
were: “attention to the experimenter;” “fol- 
lowing instructions and complying;” and 
“response to the experimenter’s interactive 
efforts.” During the awaited portion six scales 
were used; all but “following instructions 
and complying” were rated. Hence, unequal 
weight was given to each segment in arriving 
at a total interaction score. To correct this, 
and thereby give equal emphasis to the three 
social stimulus conditions in the total inter- 
action score, a mean scale rating was com- 
puted separately for each of the three stimu- 
lus conditions. The three segment scores were 
then used to arrive at a mean total score. 
Using this modified method of scoring the 
interaction, an analysis of interjudge agree- 
ment on ratings for the PKU group was 
made. Each judge’s ratings were ranked from 
1 to 37 for the PKU group for total mean 


TABLE 3

ANALYSIS OF VARIANCE OF INTERACTION SCORES FOR
THE FOUR GROUPS

Source of variation        df    MS        F
Total (average interaction situation)
  Between groups           3     10.7      10.28***
  Within groups            95    1.04
Solicited
  Between groups           3     11.096    9.098***
  Within groups            95    1.22
Rejected
  Between groups           3     5.52      4.42***
  Within groups            95    1.25
Awaited
  Between groups           3     12.33     8.17***
  Within groups            95    1.51

*** p < .01.
TABLE 4

SIGNIFICANCE OF DIFFERENCES OF MEANS FOR FOUR
GROUPS ON TOTAL AVERAGE AND THREE SUBPART
SCORES OF THE INTERACTION SITUATION

Score                      df    t
Total average
  Normal-retarded          31    2.01*
  Normal-psychotic         36    5.36****
  Normal-PKU               46    2.94***
  Retarded-psychotic       49    4.21****
  Retarded-PKU             59    1.67*
  Psychotic-PKU            64    2.60**
Solicited
  Normal-retarded          31    2.50**
  Normal-psychotic         36    4.95****
  Normal-PKU               46    2.33**
  Retarded-psychotic       49    3.50***
  Retarded-PKU             59    .34
  Psychotic-PKU            64    3.31***
Rejected
  Normal-retarded          31    .56
  Normal-psychotic         36    3.70****
  Normal-PKU               46    1.24
  Retarded-psychotic       49    3.34***
  Retarded-PKU             59    .83
  Psychotic-PKU            64    2.54**
Awaited
  Normal-retarded          31    2.16**
  Normal-psychotic         36    5.38****
  Normal-PKU               46    7.97****
  Retarded-psychotic       49    3.83****
  Retarded-PKU             59    1.02
  Psychotic-PKU            64    2.59**

* .10 > p > .05.
** p < .05.
*** p < .01.
**** p < .001.


score, and for each of the subscores of which 
this was comprised. Spearman rank-order cor- 
relations (rhos) between the two judges were 
then computed. This analysis yielded a rho 
of .95 for the total mean score; .82 for the 
solicited mean score; .93 for the rejected 
mean score; and .92 for the awaited mean 
score. All of these are significant at or beyond 
p < .001.
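
The modified scoring can be illustrated with a short Python sketch. The 24, 4, and 6 ratings per segment come from the text; the function name `interaction_scores` and the ratings of the example child are hypothetical and serve only to show how equal weighting of the three segments differs from pooling all 34 ratings.

```python
from statistics import mean


def interaction_scores(solicited, rejected, awaited):
    """Return per-segment mean ratings and their unweighted mean."""
    segment_means = {
        "solicited": mean(solicited),
        "rejected": mean(rejected),
        "awaited": mean(awaited),
    }
    # Each segment counts once, regardless of how many ratings it contains.
    total = mean(list(segment_means.values()))
    return segment_means, total


# Hypothetical child: 24 solicited, 4 rejected, and 6 awaited ratings.
solicited = [5] * 24
rejected = [1] * 4
awaited = [3] * 6

segments, total = interaction_scores(solicited, rejected, awaited)
# Pooling all 34 ratings, as in the earlier scoring, lets the solicited
# segment dominate the total.
pooled = mean(solicited + rejected + awaited)
```

For this invented child the equally weighted total is 3, whereas the pooled mean of the 34 ratings is about 4.18, which shows how the earlier scoring favored the solicited segment.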

All subsequent analyses of the data (which 
include the one PKU girl who had been 
dropped from the previous analyses) were 
based upon mean scale scores for the total 
and three subportions of the interaction pro- 
cedure. In order to determine whether the 
four groups differed with respect to inter- 
action behavior, an analysis of variance was 
computed for the total mean scores as well
as for each of the three social stimulus 
conditions. These results are presented in 
Table 3. 
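
For readers who wish to retrace the entries of Table 3, a one-way analysis of variance can be sketched as follows. This is a generic textbook computation, not code from the study; the function name `one_way_anova` is hypothetical. Each F is the ratio of the between-groups to the within-groups mean square, so that, for example, 10.7 / 1.04 is approximately 10.29, in line with the tabled value for the total score.

```python
def one_way_anova(groups):
    """Return (df_between, df_within, ms_between, ms_within, F).

    `groups` is a list of lists of scores, one inner list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return df_between, df_within, ms_between, ms_within, ms_between / ms_within
```

With four groups the between-groups degrees of freedom equal 3, as in Table 3.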

As can be seen, significant differences were 
obtained for the total mean scores as well 
as mean scores for each segment of the inter- 
action situation. To determine whether the 
means for the four groups differed signifi- 
cantly from one another for the various 
analyses of variance, t tests were employed.
These results are presented in Table 4. 

The order of total mean interaction scores, 
from high to low, was as follows: normal, 
retarded, PKU, and psychotic. The t tests
among means indicated significant differences 
for all intergroup comparisons at or beyond 
the .05 level (two-tailed test) with two ex- 
ceptions. However, one of these, the differ- 
ence in mean total interaction scores between 
retarded and PKU groups, tended toward 
significance, yielding a p < .10 (two-tailed
test). The difference between the normal and
retarded groups was significant at p < .06.

Examination of the variances on total 
interaction scores for the four groups revealed 
the PKU group to be the most heterogeneous, 
followed by the retarded group, followed next 
by the psychotic group. The normals were 
most homogeneous. 

The only nonsignificant difference among 
the four groups for the solicited phase of the 
interaction situation was between the PKU 
and retarded groups. This result suggested 
that under highly structured conditions, the 
performance of PKU subjects is similar to 
that of the retarded subjects, although both 
these groups differed significantly from all 
others.

For the rejected segment of the interaction, 
the psychotic group was significantly different 
from all others. Except for this, all other 
comparisons were nonsignificant.

For the last phase of the interaction pro- 
cedure (awaited), all differences except for 
the comparisons of PKU and retarded groups 
were significant at or beyond the .05 level
(two-tailed test). 

While these findings would suggest that the 
PKU group tends to differ from the retarded 
group on total scores for the interaction procedure,
the heterogeneity of variance suggests
that within the PKU group there were
subjects whose performance was character- 
istic of the modal type within each of the 
groups. Essentially, there were seven PKU 
subjects whose scores corresponded to the 
lowest scores of the psychotic group. There 
were also 20 youngsters whose scores were 
similar to the retarded and/or brain-damaged 
group. Finally, there were 11 youngsters in 
the PKU group who were unlike the retarded 
or psychotic children. 

To determine the relationship of measures 
of intelligence and language comprehension 
to interaction scores, Spearman rank-order 
correlations (rhos) were computed for all pos- 
sible combinations of the total and subpart 
interaction scores with the Stanford-Binet or 
the WISC and PPVT IQ scores on data 
obtained from the 30 testable PKU subjects. 
After a procedure developed by Kendall and 
described by Siegel (1956), the rho values 
were converted to t values. The rhos between
interaction scores and WISC IQ scores were
uniformly low (e.g., total, .31; solicited, .35;
rejected, .15; and awaited, .33). All t values
failed to reach significance at p < .05. The 
rhos for the intercomparisons of interaction 
data with PPVT IQ scores were uniformly 
higher than those for interaction and intelligence
criteria (e.g., total, .51; solicited,
.38; rejected, .34; and awaited, .50), and t
values for all intercomparisons except that 
between PPVT IQs and the rejected segment 
of the interaction situation were significant 
at or beyond p < .05. These findings would 
suggest that the interaction situation taps 
aspects of behavior which are perhaps dif- 
ferent from those attributes measured by tests 
of intelligence. Hence, the interaction measure 
appears to add another dimension to the 
clinical assessment of children. The relatively 
higher correlations between measures of lan- 
guage comprehension (PPVT) and interaction 
criteria are not surprising in that subjects, 
during the interaction procedure, are rated 
for verbal responsiveness and presumably 
would respond more appropriately and confi- 
dently as a function of better understanding 
of the experimenter’s directions and asides. 
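
The conversion of rho to t mentioned above can be sketched as follows. The sketch assumes the usual formula, t = rho * sqrt((N - 2) / (1 - rho^2)) with N - 2 degrees of freedom, and assumes N = 30 for every coefficient; the function name `rho_to_t` is hypothetical, and the text does not state the exact computation that was used.

```python
import math


def rho_to_t(rho, n):
    """Convert a Spearman rho based on n pairs to a t value with n - 2 df."""
    if n < 3 or abs(rho) >= 1:
        raise ValueError("need n >= 3 and -1 < rho < 1")
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))


# Two-tailed .05 critical value of t for 28 df (standard tables).
CRITICAL_T_28 = 2.048

# PPVT coefficients reported in the text, with N = 30 assumed.
ppvt_rhos = {"total": 0.51, "solicited": 0.38, "rejected": 0.34, "awaited": 0.50}
ppvt_significant = {name: rho_to_t(r, 30) > CRITICAL_T_28
                    for name, r in ppvt_rhos.items()}
```

Under these assumptions only the rejected segment (rho = .34) falls short of the critical value, which matches the pattern reported for the PPVT comparisons.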

Although such variables as blood-phenylalanine
level, age at which the diet was started, independent
measures of emotional disturbance, etc. may
very well be related to the measures employed
here, we elected to omit these analyses.
This decision was based upon our awareness
that dietary treatment and pediatric care
were based in part upon the child’s adequacy
of emotional adjustment, alertness,
level of intelligence, etc., as well as the avail- 
ability of the special nutritional supplement. 
Therefore, any intervariable comparisons 
would be biased because of such selectivity. 
It is recognized that this would be a limita- 
tion that the disorder imposes upon behav- 
ioral studies where the subjects are children 
comparable to ours in age and medical 
management. 

The question could be raised that these 
results reflect a difference in the age of the 
groups rather than differences as a function 
of diagnostic classification. It will be recalled 
that the normals were on the average the 
youngest of the groups and also scored 
highest on the interaction procedure, whereas 
the psychotics and PKUs, the oldest of the 
groups, scored lowest. The comparison of age 
and interaction scores for the entire group 
would have been confounded with diagnosis. 
Therefore, to determine the relationship of 
age and interaction score, a Spearman rank- 
order correlation was computed for the 
ranked interaction scores and age data for 
the PKU group. The results were as follows: 
The correlation of age and (a) total mean 
score was —.04; (b) solicited mean score, 
—.01; (c) rejected mean score, .05; and (d) 
awaited mean score, —.02. All are nonsignifi- 
cant, and hence indicate that age and inter- 
action total or part scores are not significantly 
correlated for the PKU group. 
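
A Spearman rank-order correlation of the kind reported here can be sketched in Python as follows. The function names are hypothetical, and the treatment of ties (tied values share the average of their ranks, and rho is the product-moment correlation of the ranks) is an assumption, since the text does not say how ties were handled.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(vx * vy)
```

A coefficient near zero, such as those reported for age, means that the ordering of the children by age carries almost no information about their ordering on the interaction scores.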

The data presented here confirm prior 
impressions that PKU children are behavior- 
ally heterogeneous. There is clearly a need 
for further study of personality variables in 
such children and the present investigation 
demonstrates that this is feasible. As other 
metabolic disorders related to PKU are dis- 
covered, and as early detection and treatment 
procedures become more uniform, it may be 
possible to relate specific indexes of inter- 
personal behavior to a wide range of physio- 
logical, biochemical, and behavioral factors. 
PUNCTUAL AND PROCRASTINATING STUDENTS:
A STUDY OF TEMPORAL PARAMETERS 1


SIDNEY J. BLATT AND PAUL QUINLAN
Yale University 


Punctual and procrastinating students, selected on the basis of when in the 
semester they met a course requirement, were compared on a number of 
temporal parameters. The 2 groups consisted of males similar in age, educa- 
tion, CEEB scores, college grades, and extent of extracurricular activities. 
The groups did not differ significantly on authoritarian values, measures 
of divergent thinking, and general intelligence, but they did differ significantly 
on several measures of temporal parameters. Punctual Ss had greater future 
time extension in fantasy productions, reported less preoccupation with death, 
and did significantly better on the WAIS Picture Arrangement, a scale assumed 
to assess the capacity for anticipation and planning. There was also a trend 
(p < .10) for punctual Ss to have less interference on the Stroop Color-Word 
Test. These findings were discussed in terms of the role of temporal organization
in personality and adaptation, and as indicating that the time of volunteering
for participation in research is an important variable which affects
representative sampling.


Time, as a fundamental rhythm expressed 
in part in the cyclical biological functions of 
the organism, is a basic dimension of reality 
to which man must accommodate. Psycho- 
logical maturity requires ever increasing ad- 
herence to, and integration of, temporal 
regularities. Time can be experienced as confining,
or it can be used to regulate experiences,
or it can give order, continuity, and
purpose to existence. Man’s position vis-a-vis 
time, therefore, can reflect interpersonal 
mutuality and cooperation, a revolt against
social regulation, and/or a passive adher- 
ence to external constraints (Fraisse, 1963). 
Temporal parameters, such as the capacity 
for delay and the ability to anticipate, plan, 
and understand means-end relations, are es- 
sential features of secondary process think- 
ing. Anxiety as a signal function can exist 
only with anticipation of possible future 
events. Delay and anticipation are essential 
components of the shift from the pleasure 
principle to the reality principle (Hartmann, 
Kris, & Lowenstein, 1964). Attitudes toward 
time and the use and experience of time, such 
as the relative capacity for delay, anticipation,
and planning, are basic features of
general modes of adaptation or character
style.

1 The research reported herein was supported
through the Cooperative Research Program of the
Office of Education, United States Department of
Health, Education, and Welfare, Project No. 1931,
“Nonintellectual Factors in Cognitive Efficiency.”

The study of time as a psychological vari- 
able has been of three general types: atti- 
tudes toward time, time estimation, and time 
perspective. Two recent reviews (Fraisse, 
1963; Wallace & Rabin, 1960) provide excel- 
lent summaries of the research on temporal 
constructs. These reviews also summarize 
much of the literature on temporal experi- 
ences in psychopathology and clearly indicate 
the pivotal role that temporal constructs have 
in psychological functioning. 

The present study is concerned primarily 
with time perspective, that is, the capacity 
to relate current experiences to a historical 
past and to an anticipation of the future. Of 
particular interest in this study are the 
individual differences in the capacity for 
anticipation and planning and the relation- 
ship of these differences to other psycho- 
logical functions. Although time perspective 
has been studied through a variety of tech- 
niques, there is considerable agreement about 
the importance of time perspective in psycho- 
logical organization and development and its 
impairment in psychopathology (Arieti, 
1947; Du Bois, 1954; Roth, 1961; Roth & 
Blatt, 1961; Schilder, 1936; Wallace & 
Rabin, 1960). Time perspective as a psycho- 
logical parameter has a prospective and a 
retrospective dimension, but it is particularly
the prospective span which is essential for 
purposeful, goal-directed behavior. 

Individual differences in future time per- 
spective (FTP) and in the capacity for plan- 
ning and anticipation can be particularly 
vivid for the undergraduate instructor. Most 
students are punctual in meeting course re- 
quirements, but, invariably, there are a few 
who delay until the last possible moment and 
frantically rush to meet a deadline or request 
a time extension. This phenomenon offers an 
excellent behavioral criterion of the capacity 
for anticipation and planning which can be 
used to study individual differences in time 
management and to test assumed measures of 
future time perspective and the capacity for 
anticipation and planning. A comparison of 
punctual and procrastinating students would 
not only be valuable in studying the psycho- 
logical processes related to time management, 
but it could also raise an important methodo- 
logical issue about a potential source of indi- 
vidual differences in demographically similar 
subjects and stress the need, when using 
volunteers in psychological research, for con- 
trolling the variation in the time when a 
subject participates in an experiment. 


METHOD 


A requirement in the year-long introductory psy- 
chology course at Yale is to participate in a psycho- 
logical experiment for 2 hours each semester. Fifteen 
subjects (Ss) who, within the first week of the fall 
semester of 1963, had completed their requirement
to participate in an experiment, were selected as 
the punctual Ss or “Early Volunteers” (EV) for 
the present study. One of these Ss was unable to 
participate, however, because of illness. Fifteen 
“Late Volunteers” (LV) were obtained during the 
last 2 weeks of the first semester by contacting 
students who had not yet made any arrangements 
to meet the course requirement for participation 
in a psychological experiment. Early and late volun- 
teers, selected on the basis of their behavior in 
meeting a course requirement during the first se- 
mester, were seen individually and in random order 
during the beginning of the second semester. All Ss 
were either in the freshman or sophomore class and 
had selected the psychology course to meet the 
requirement of an introductory course in the social 
sciences.

To assess the general intellectual level of the two 
groups, EV and LV Ss were compared on the In- 
formation and Vocabulary subtests of the Wechsler 
Adult Intelligence Scale (WAIS) and on College 
Entrance Examination Board (CEEB) scores. The 
WAIS Information and Vocabulary subtests were
selected because they are untimed and have the 
highest correlation of all WAIS subtests with Full 
Scale IQ (Wechsler, 1955). Student records were 
available, so the two groups were also compared 
on college grades and number of extracurricular 
activities. 

In addition to these control variables, the two 
groups were compared on a variety of measures of 
temporal parameters including the Picture Arrange- 
ment (PA) subtest of the WAIS, story stems used 
to assess FTP (Barndt & Johnson, 1955; Wallace, 
1956), a death-concern questionnaire (Dickstein & 
Blatt, 1966) found to be related to FTP, and the 
Stroop Color-Word Test (Stroop, 1935). 

The WAIS Picture Arrangement subtest was of 
particular interest since Rapaport, Gill and Schafer 
(1946) suggested that this subtest, in requiring sub- 
jects to place cartoon frames in meaningful se- 
quences, requires a capacity for anticipation and 
planning. Rapaport et al. assumed that the capacity 
to understand the cause and effect relationships 
between a series of discrete pictures requires the 
capacity to anticipate from one moment to the next. 
When this capacity for anticipation is relatively 
impaired, each event occurs in isolation and there 
is little organization or continuity. 

The measurement of FTP is a relatively recent 
development and one of the early empirical studies 
in this area was that of LeShan (1952) who related 
FTP to social class. LeShan developed a procedure 
to measure FTP in spontaneously told stories, and 
this technique was modified into a story-completion 
procedure by Barndt and Johnson (1955) and by 
Wallace (1956). In the story-completion procedure 
four story roots are presented verbally (e.g., 1. At
3 o’clock one bright sunny afternoon in May, two 
men were walking near the edge of town... ). 
For each root Ss were asked to make up a story, 
and when the amount of time transpiring in the 
action of the narration was not clear, Ss were asked 
how much time had elapsed in their stories. Each 
of the four story roots was scored for FTP. Prior 
research (Kastenbaum, 1961; Wallace, 1956) indi- 
cated that Roots 1 and 2 are different from Roots 
3 and 4 in that they are more structured, the story 
being anchored at a particular point in time and 
involving an interpersonal situation. 

A death-concern questionnaire, developed in an 
earlier study (Dickstein & Blatt, 1966), is an eight- 
item questionnaire designed to assess the degree of 
preoccupation with death. Death concern was found 
to have a significant negative relationship to FTP 
and to performance on the WAIS PA subtest. 

The Stroop Color-Word Test (Stroop, 1935) was 
also included in the study because of an assumed 
relationship between the capacity for delay and the 
development of a time sense (Freud, 1920, 1925). 
Part III of the Stroop test has been considered 
as requiring the capacity to inhibit the overlearned, 
readily available, and compelling response to the 
printed word while attending to the color (Gardner, 
Holzman, Klein, Linton, & Spence, 1959; Stroop, 
1935). EV and LV Ss were compared on all three 
parts of the Stroop test and on an “interference 
score” (Gardner et al., 1959). It was expected that 
LV Ss would be more distractable and, therefore, 
would have a higher interference score on the 
Stroop test. 

Though it was expected that the major dif- 
ferences between EV and LV Ss would be on 
temporal parameters, it seemed possible that the 
groups could also differ in their tendency to con- 
form. The LV Ss might be more actively defiant of 
limits and less tolerant of authority, while EV Ss 
could be more submissive and accepting of author- 
ity and authoritarian structure. As a first attempt 
to test this hypothesis, the groups were given an 
abbreviated form of the California F Scale. Also 
two of the Guilford measures of divergent thinking 
(Getzels & Jackson, 1962; Guilford, 1957) were 
given to the groups to test for differences in the 
tendency to think in unusual and unconventional 
ways. 

All testing was done by the same experimenter 
(E) in an individual testing session about 2 hours 
in length. EV and LV Ss were seen in random order 
and testing and scoring were done blind. All Ss were 
asked what they thought the study was about, and 
none of them was aware of its nature. 


RESULTS 


EV and LV Ss were compared on a number 
of demographic variables and there were no 
significant differences between the two groups 
on the WAIS Vocabulary and Information 
subtests, the Mathematics and Verbal CEEB 
Scores, college grades, and the number of 
extracurricular activities that Ss participated 
in during the semester that they were selected 
for study. 

Table 1 presents the comparison of EV 
and LV Ss on measures of anticipation and 
planning, future time perspective, and death 
concern. The EV, as compared to the LV Ss, 
reported significantly less preoccupation with 
death, told stories which extended further 
into the future, and had significantly higher 
WAIS PA scores. It is possible that the dif- 
ference between the two groups on the PA 
subtest was not a function of differences in 
the assumed capacity for anticipation and 
planning (Rapaport et al., 1946), but that 
EV Ss worked more rapidly and the differ- 
ences between the two groups were a function 
of time bonuses. Examination of the PA per- 
formances, however, indicated that the sig- 
nificant difference between the two groups 
was primarily a function of a greater number 
of incorrect sequences in the LV group and 

TABLE 1 

A COMPARISON OF EARLY AND LATE VOLUNTEERS ON 
WAIS SUBTESTS AND MEASURES OF DEATH 
CONCERN AND FTP 

Measure                      M, EV (N = 14)   M, LV (N = 15)   t 
WAIS 
  Information                14.64            14.80            0.10 
  Vocabulary                 15.14            15.33            0.02 
  Picture Arrangement        12.79            10.20            2934ta 
Death concern                19.93            23.50            1.92** 
Future time perspective* 
  Story 1                    11.64            18.13            3.36*** 
  Story 2                    11.61            18.17            3.38*** 
  Story 3                    15.75            14.30            -0.46 
  Story 4                    16.79            13.33            -1.09 

Note. The death-concern questionnaire was inadvertently not given to 2 LV Ss, therefore the df for this measure 
* Scores on the story stems are based on ranks and, therefore, the comparison of means is expressed in the unit normal deviate (z) rather than as a t test. 
** p < .03 (one-tailed). 
*** p < .01 (one-tailed). 


only secondarily a function of time bonuses. 
The LV Ss had a total of 28 incorrect 
sequences as compared to only 9 incorrect 
sequences for the EV group, and this dif- 
ference was statistically significant (p < .05). 
In terms of points gained from time bonuses, 
the LV group received a total of 12 time 
bonuses as compared to a total of 20 for the 
EV group, but this difference was not sta- 
tistically significant. In addition, no points 
were lost in either group because of exceeding 
time limits. 

An interesting difference was noted in the 
performance of EV and LV Ss on individual 
items of the PA subtest. The EV Ss were 
generally more successful on each item of the 
PA subtest, but there was one reversal in this 
pattern and this occurred on the sixth item 
(the Flirt Sequence). The correct (four 
points) sequence on this item has the Little 
King in a car, seeing an attractive woman 
carrying a bundle, ordering his chauffeur to 
stop, getting out of the car and walking with 
the woman, carrying her bundle on his head 
(JANET). Wechsler (1955) gives part credit 
(two points) for two alternate arrangements, 

TABLE 2 

A COMPARISON OF EARLY AND LATE VOLUNTEERS ON 
MEASURES OF DIVERGENT THINKING AND THE 
STROOP AND F SCALE MEASURES 

Measure                        M, EV (N = 14)   M, LV (N = 15)   t 
California F Scale             93.21            96.40            0.40 
Guilford divergent thinking 
  Unusual uses                 30.43            29.80            0.13 
  Word association             42.93            42.53            0.28 
Stroop Color-Word Test 
  Part I                       38.53″           39.33″           0.37 
  Part II                      54.15″           53.54″           0.21 
  Part III                     93.82″           102.68″          0.94 
  Interference score           -5.12            +4.55            1.60* 

Note. One EV S was not given the Stroop test and, therefore, the df for the Stroop is 26 rather than 27. 
* p < .07 (one-tailed). 


one where the woman walking is placed first 
in the sequence (AJNET) and the other 
where the woman is placed third in the se- 
quence after the Little King has ordered the 
chauffeur to stop (JNAET). Eight of the 
14 EV Ss gave a two-point sequence on this 
item, and seven of these eight two-point 
sequences had the woman placed third, after 
the King has told the chauffeur to stop 
(JNAET). In contrast, 6 of the 15 LV Ss 
gave a two-point answer to this item, but 
4 of the 6 had the woman placed first in the 
sequence (AJNET). Though EV Ss may have 
a more fully developed temporal organiza- 
tion and a greater sense of responsibility, the 
type of partial error on Item 6 of the PA 
suggests that they may also be a formal, 
controlled, somber, ascetic group, who prefer 
to postpone or even avoid pleasure and satis- 
faction. The LV Ss, in contrast, may be a 
more spontaneous, labile, or even impulsive 
group who seek fun and pleasure. 

Further support for the impulsivity and 
lability of LV Ss was seen in the Stroop 
Color-Word Test. As indicated in Table 2, 
the two groups did not differ significantly on 
Parts I, II, and III of the Stroop test, but 
the difference between the groups approached 
statistical significance (p = .07) on the “in- 
terference measure” (Part III/Part II). There 
were no significant differences between the 
two groups, however, on authoritarian values 
or the capacity for divergent thought. 


DISCUSSION 


The results clearly indicated that there 
are major psychological differences between 
punctual and procrastinating Ss; these dif- 
ferences were found primarily in temporal 
dimensions, such as the extent of future time 
perspective, the capacity for anticipation and 
planning, and a resistance to distractibility. 
Punctual and procrastinating Ss also differed 
significantly on a questionnaire about the 
degree of concern about death. 

The significantly higher death-concern 
scores of the procrastinating Ss were con- 
sistent with earlier findings (Dickstein & 
Blatt, 1966) of a relationship between death 
concern and temporal parameters and with 
a recent report of the psychoanalysis of a 
patient with “chronic and intractable late- 
ness,” where one of the functions of the 
lateness was to ward off fears of death (Orgel, 
1965). Though the relationship between pro- 
crastination, death concern, and lower scores 
on PA could be a function of psychomotor 
retardation associated with depression rather 
than a primary relationship between death 
concern and time, the data of the study by 
Dickstein and Blatt (1966) do not support 
this interpretation. High death-concern Ss 
gained only 10 points from time bonuses on 
the PA subtest as compared to a total of 
11 time-bonus points gained by the low 
group; no points were lost in either group 
for exceeding time limits. The significant dif- 
ference on PA between high and low death- 
concern Ss in this prior study was almost 
exclusively a function of the number of 
incorrect responses. 

The significant difference found between 
EV and LV Ss on several of the story stems 
offered further support for the value of using 
fantasy productions in the study of temporal 
extension. Both the analysis of stories told to 
Thematic Apperception Test cards (Epley & 
Ricks, 1963) or elicited by verbal stems 
(Barndt & Johnson, 1955; Wallace, 1956) 
have led to meaningful research in the area 
of FTP. It should be noted, however, that 
the significant difference found between EV 
and LV Ss occurred only in stories told to the 
first two stems. In the present study, the cri- 
terion of punctuality versus procrastination 
was defined in terms of a specific time dead- 
line and within the interpersonal matrix of 
student and instructor. It is consistent, there- 
fore, that the differences in FTP between the 
two groups should occur on stories told to 
Stems 1 and 2, that is, the more struc- 
tured, interpersonal stems. In contrast, dif- 
ferences in FTP between high and low death- 
concern Ss occurred primarily on the less 
structured, noninterpersonal third and fourth 
stems (Dickstein & Blatt, 1966). 

The significant difference found between 
punctual and procrastinating Ss on the WAIS 
PA subtest supports the assumption that PA 
assesses, at least in part, a capacity for an- 
ticipation and planning (Rapaport et al., 
1946). The ability to anticipate from one 
event or moment to the next on the PA sub- 
test seems to assess a more general capacity to 
extend oneself into the future and to estab- 
lish a sense of continuity so that effective 
planning can take place. 

It is consistent that EV Ss, who have 
greater FTP and better anticipation and 
planning, should also experience relatively 
less interference and distraction on the Stroop 
test. The degree of distractibility on the 
Stroop is in part a function of the capacity 
to delay and inhibit the more immediate re- 
sponse to the printed words, so that the color 
of the ink can be named accurately and 
rapidly. Postponement or delay has been 
conceptualized as one of the early stages 
in the development of an understanding 
and utilization of time. It is only through 
delay of immediate gratification that one 
is able to make initial discriminations in 
reality and to develop a capacity for 
anticipation and planning which is neces- 
sary to strive toward temporally more dis- 
tant goals (Freud, 1911; Hartmann, 1958; 
Rapaport et al., 1946). The finding of the 
present study, that punctual Ss with a more 
fully developed sense and use of time also 
tend to experience relatively less interference 
or distraction on the Stroop test, offers sup- 
port for the conceptualization that the capac- 
ity to delay or inhibit an immediate response 
may be related to planning and anticipation. 

The differences found in the present study 
between demographically similar EV and LV 
Ss indicate that the temporal sequence in 
which Ss are obtained for research can be a 
confounding factor in what initially appears 
to be a relatively homogeneous sample. Re- 
search conducted early or late in the semester 
seems to sample significantly different popu- 
lations, and ambiguous and even contradictory 
findings might be obtained, particularly if the 
variables of the study are related to temporal 
parameters. Similar methodological issues 
have been raised about sampling bias created 
by Ss who miss appointments or refuse to par- 
ticipate in a study (Abeles, Iscoe, & Brown, 
1954; Frey & Becker, 1958; Martin & Mar- 
cuse, 1958). The results of the present study 
indicate an even more subtle source of sam- 
pling bias and again raise the serious question 
about the advisability of using volunteers in 
relatively uncontrolled ways. Campbell and 
Stanley (1963), in discussing the problem of 
the representativeness of volunteer Ss in re- 
search, commented that “early volunteers are 
a biased sample, and the total universe ‘sam- 
pled’ changes from day to day as the experi- 
ment goes on, as more pressure is required to 
recruit volunteers, etc. [p. 194].” Though 
random assignment of volunteers to treatment 
groups may equate treatment groups, random- 
ization does not resolve the issue of repre- 
sentative sampling (Campbell & Stanley, 
1963). The results of the present study indi- 
cate that a wide range of psychological func- 
tions are related to the time of volunteering 
and clearly stress the need to be concerned 
about the issue of representative sampling 
when volunteers are used in research. 

Thus far in this paper punctuality and 
procrastination have been viewed as highly 
stable and enduring character traits. Indi- 
viduals do seem to tend toward one pole or 
the other pole, but there is undoubtedly a 
considerable range in the consistency of an 
individual’s tendency toward punctuality or 
procrastination. Further research should be 
devoted to studying other aspects of punc- 
tual and procrastinating Ss (such as the de- 
gree of negativism in procrastinating Ss or 
the overconformity of punctual Ss), the effect 
of a variety of situational contexts (such as 
positive and negative experiences) on such 
behavior, and the interaction between these 
individual and situational dimensions on tem- 
poral parameters. There are individuals who 
live primarily in the present and who see 
relatively little relationship between the pres- 
ent and their historical past or their future, 
and there are those who live in a complex 
temporal universe with continuity and pur- 
pose. These dimensions reflect fundamental 
qualities of an individual’s existence, and they 
should become increasingly important dimen- 
sions in psychological theory and research. 

The present study shows that global, clinical judgments of improvement can 
be predicted by a linear combination of various elements of symptom reduc- 
tion. This runs counter to the contention of Meehl (1950) that configural, 
nonlinear combinations are necessary to account for clinical judgment. Of 
some pragmatic value is the fact that the prediction equation can form the 
basis of a composite improvement score whose elements of symptom reduction 
are weighted according to their relationship to the clinically face-valid rating 
of improvement. This composite is shown to be highly consistent between 
various drug treatments and between study samples on the same drug treat- 
ment. Moreover, it is shown to be a more sensitive discriminator between 
drug- and placebo-treated patients. The composite is offered as a more sensitive 
measure of improvement in acute schizophrenia. 


Interest in the nature of clinical judgment 
derives from two major sources: (a) those 
interested in the processes by which the cli- 
nician pieces together various aspects of a 
situation to form a judgment (Horst, 1954; 
Hunt, 1959; Lubin & Osburn, 1957; Mc- 
Quitty, 1957; Meehl, 1950) and (b) those 
interested in sensitizing the measures of clini- 
cal judgment so that these might be of greater 
value in studies of psychopathology (Lorr, 
Klett, & McNair, 1963; Spitzer, 1965; Wit- 
tenborn, 1962). 

In treatment-evaluation studies, the global 
rating of improvement rendered by the cli- 
nician has been used frequently. Although it 
has been severely criticized by Lorr (1963), 
among others, as not being a unitary char- 
acteristic in a factorial sense, it still has con- 
siderable appeal to clinicians because it is 
simple, direct, comprehensive, and face-valid. 
Moreover, its rater-agreement reliability and 
its validity in terms of discriminations be- 
tween drug and placebo patients are impres- 
sive (NIMH-PSC, 1964). Thus it would be 
of interest to ascertain what the clinical judge 
takes into account when rating global im- 
provement. This might be of aid in the ulti- 
mate development of an even more sensitive 
measure of improvement. 


1 This study was supported by National Institute 
04663, MH 04667, MH 04673, MH 04674, MH 
stitute of Mental Health Contract No. SA-43-ph- 
064, 


RESEARCH QUESTIONS 


1. To what extent can clinical judgments of 
improvement be accounted for by a linear 
combination of various elements of symptom 
reduction, even though clinicians suggest that 
configural or nonlinear patterns are heavily 
involved in clinical judgments? The difficult- 
ies in configural analysis are aptly summa- 
rized by Wainwright (1965), 


the real drawback to configural analysis is the large 
number of parameters to be determined in the ab- 
sence of any quantitative indication that an advan- 
tage will be gained by so doing ... there may be 
some disadvantages. Greater subsequent prediction 
may be obtained with a linear model than with a 
configural one, even though the configural model 
must have as great or greater accuracy in the ex- 
perimental sample (See Lee, 1956; Lubin, 1954; 
Forehand & McQuitty, 1959.) [p. 8]. 


2. What are the various elements of symp- 
tom reduction in acute schizophrenia associ- 
ated with judgments of improvement (a) as 
based on interview behavior and (b) as based 
on ward observations? 

3. Can a composite of symptom-reduction 
scores be developed which is even more sen- 
sitive than the global rating of improvement? 


METHOD AND PROCEDURE 


The data presented in this paper were drawn from 
two separate studies sponsored by the National In- 
stitute of Mental Health, Psychopharmacology 
Service Center. Both studies involved acutely psy- 
chotic, schizophrenic patients. In Study I, over 250 
patients were randomly assigned to three pheno- 
thiazine drugs: chlorpromazine (Thorazine), fluphen- 
azine (Permitil, Prolixin), and thioridazine (Mel- 
laril). In Study II over 450 patients were randomly 
assigned to acetophenazine (Tindal), chlorpromazine 
(Thorazine), and fluphenazine (Permitil, Prolixin). 
Thus, two of the drugs, chlorpromazine and fluphen- 
azine, were included in both studies. 

The general background of the project, the de- 
tails of the research design, and the characteristics 
of the samples in the hospitals have been published 
elsewhere by the NIMH-PSC Collaborative Study 
Group (1964, 1966). A summary is provided here 
for orientation. The following nine institutions par- 
ticipated in Study I: Boston State Hospital, Bos- 
ton, Massachusetts; District of Columbia General 
Hospital, Washington, D. C.; Kentucky State Hos- 
pital, Danville, Kentucky; Malcolm-Bliss Mental 
Health Center, St. Louis, Missouri; Mercy-Douglass 
Hospital, Philadelphia, Pennsylvania; Payne-Whit- 
ney Clinic, New York, New York; Rochester State 
Hospital, Rochester, New York; Springfield State 
Hospital, Sykesville, Maryland; and the Institute of 
Living, Hartford, Connecticut. In Study II all but 
Payne-Whitney Clinic participated. 

After being randomly assigned to treatment, a 
patient in Study I completed 6 weeks of treatment 
and in Study II, 5 weeks of treatment—all in the 
hospitals. Double-blind conditions were maintained 
throughout. 

Clinical status of patients was evaluated prior to 
treatment and at posttreatment by a number of 
methods, including the following: (a) clinical as- 
sessment of general severity of illness and of degree 
of improvement by the psychiatrist, psychologist, 
and nurse; (b) the Inpatient Multi-Dimensional Psy- 
chiatric Scale (IMPS) developed by Lorr, Klett, 
McNair and Lasky (1963), completed by the psy- 
chiatrist or psychologist and based upon 1-hour 
diagnostic interviews; and (c) the Ward Behavior 
Rating Scale (WBRS) developed by Burdock, 
Hakerem, Hardesty, and Zubin (1960), completed by 
nurses on the basis of observation on the ward. 

Scales for specific aspects of psychopathology 
were derived by Goldberg, Cole, and Clyde (1963) 
from factor analyses of the IMPS and WBRS rat- 
ings made prior to treatment.² The factor titles of 
the IMPS and WBRS are given in Table 1. 


The Symptom-Reduction Predictors 


Symptom-reduction scores were calculated for 
each of the 21 factor scores derived from the IMPS 
and WBRS as follows: 


\[
\text{symptom-reduction score} = \frac{(\text{posttreatment score} - \text{pretreatment score}) + \text{maximum possible score}}{\text{maximum possible score}}
\]


In effect, this is a proportional change score ar- 
ranged arbitrarily so that a lower value indicates 
greater symptom reduction in order to conform 
with the direction of the criterion scoring. 
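
The proportional change score defined above can be sketched in a few lines. This is only an illustration: the function name and the example factor (maximum possible score of 10) are assumptions, not part of the original scoring materials.

```python
def symptom_reduction_score(pre: float, post: float, max_score: float) -> float:
    """Proportional change score: ((post - pre) + max) / max.

    Values below 1 indicate symptom reduction, 1 indicates no change,
    and values above 1 indicate worsening.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return ((post - pre) + max_score) / max_score


# Hypothetical factor whose maximum possible score is 10.
improved = symptom_reduction_score(pre=8.0, post=2.0, max_score=10.0)
unchanged = symptom_reduction_score(pre=8.0, post=8.0, max_score=10.0)
worsened = symptom_reduction_score(pre=2.0, post=8.0, max_score=10.0)
```

Adding the maximum possible score before dividing keeps every value nonnegative (between 0 and 2), while lower values still correspond to greater symptom reduction, in line with the 1-to-7 criterion scale on which 1 is "very much improved."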


The Criteria of Global Clinical Improvement 


At posttreatment the following rating was made 
by psychologists and psychiatrists on the basis of the 
patient’s interview behavior and by nurses and ward 
attendants on the basis of ward observations: “How 
much has the patient changed since entering the 
study?” This item was rated on a 7-point scale 
ranging from 1, “very much improved,” through 4, 
“no change,” to 7, “very much worse.” 

Although the wording of the item was identical 
for both raters, it was expected that their judgments 
would be based on different opportunities to ob- 
serve. A nurse, for example, might typically chat 
with a patient, but no interview would be conducted. 


RESULTS 


Multiple correlations and regression equa- 
tions were obtained for each drug treatment 
in each of the two study samples between the 
global-improvement rating as the criterion and 
the 21 symptom-reduction measures as the 
predictors. This procedure was carried out 
separately for the doctor’s and the nurse’s 
improvement ratings. 
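
As a rough illustration of the kind of analysis described, the sketch below fits a linear least-squares equation and computes the multiple correlation R between predicted and observed criterion values. It is not the authors' computation: the data are synthetic, only three predictors are used instead of 21, and the helper names (`solve`, `fit_linear`) are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows, y):
    """Return least-squares weights (intercept first) and the multiple R."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    w = solve(xtx, xty)
    pred = [sum(wi * xi for wi, xi in zip(w, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return w, (1.0 - ss_res / ss_tot) ** 0.5


# Synthetic "patients": three change scores and a noisy linear criterion.
random.seed(0)
rows, y = [], []
for _patient in range(200):
    x1, x2, x3 = (random.uniform(0.0, 2.0) for _k in range(3))
    rows.append((x1, x2, x3))
    y.append(1.0 + 2.0 * x1 + 0.5 * x2 + random.gauss(0.0, 0.5))

weights, multiple_r = fit_linear(rows, y)
```

In the study this fit would be repeated for each drug treatment, each study sample, and each type of rater, yielding one equation per combination.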


2 The analyses of IMPS factor scores reported in 
this paper are based on single raters rather than the 
average of multiple raters. Reliabilities in terms of 
intraclass correlations between raters are reported by 
Lorr et al. (1963), the median being about .79. This 
is probably a conservative estimate of the reliabil- 
ity, since raters from different hospitals were in- 
volved, and any differences in rating standards among 
the hospitals would contribute to increased error. 

3 Since the global rating was completed immedi- 
ately after the specific ratings by the same person, 
the question of halo effect must be considered. Since 
our basic research question asked what the rater 
took into account when rating globally, it is possible 
that the total basis for his ratings is a generalized 
and undifferentiated set known as halo. However, if 
this were so, then the original factor analysis of 
the symptom ratings would not have shown more 
than one factor, and the multiple-regression equa- 
tions in the present results would not have shown 
more than one independent predictor. Moreover, 
since one of our tasks was to estimate improvement 
as it is conceived clinically, we wanted to include 
halo if it were there. 


Reproducibility between Study Samples 


Since chlorpromazine and fluphenazine are 
in both studies, it is possible to determine 
whether the pattern of symptom change as- 
sociated with global improvement (multiple- 
regression equation) is consistent from one 
sample of patients to another when treated by 
the same drug. The method for comparing 
multiple-regression equations is outlined by 
Williams (1959). 

The findings are that there are no differ- 
ences⁴ between the two study samples with 
respect to the prediction equations for (a) 
the doctor’s improvement rating on chlor- 
promazine, (b) the doctor’s improvement rat- 
ing on fluphenazine, (c) the nurse’s improve- 
ment rating on chlorpromazine, and (d) the 
nurse’s improvement rating on fluphenazine. 


Reproducibility between Drug Treatments 


For this analysis, the regression equations 
for each of four drug treatments were com- 
pared. Since there were no differences between 
study samples on a given drug, the samples 
on a given drug from both studies were 
treated as one sample. 

For both the doctors and the nurses the 
differences among the four regression equa- 
tions are not statistically significant (see 
Footnote 4). It thus appears that the pattern 
of symptom reduction is not only consistent 
from one sample of patients to another when 
treated with the same drug, but also is con- 
sistent between different drug treatments. 


Symptom-Reduction Patterns 


The absence of significant regression dif- 
ferences permits combining samples from all 
drugs and both studies in order to obtain 
more stable estimates of regression weights. 
Table 1 presents the regression coefficients 


4 Summary tables of the analyses of variance test- 
ing the significance among the multiple-regression 
equations have been deposited with the American 
Documentation Institute. Order Document No. 9275 
from ADI Auxiliary Publications Project, Photo- 
duplication Service, Library of Congress, Washing- 
ton, D. C. 20540. Remit in advance $1.25 for micro- 
film or $1.25 for photocopies and make checks pay- 
able to Chief, Photoduplication Service, Library of 
Congress. 

TABLE 1 

REGRESSION COEFFICIENTS OF SYMPTOM-CHANGE 
SCORES ON GLOBAL RATINGS OF IMPROVEMENT: 
ALL DRUGS 

Variable                        Doctor’s rating   Nurse’s rating 
                                (R = .632)        (R = .594) 
IMPS 
  Hostility                     .467**            .177 
  Disorientation                .229              .234 
  Guilt                         .383*             -.015 
  Auditory hallucinations       .187              .139 
  Agitation and tension         .502*             .020 
  Slowed speech and movements   5720              .106 
  Delusions of grandeur         .151              .156 
  Indifference to environment   .683***           Bome 
  Incoherent speech             -.477*            -.102 
  Pressure of speech            .389*             .160 
  Ideas of persecution          OT Biia           .105 
  Hebephrenic symptoms          -.060             -.012 
  Nonauditory hallucinations    .041              -.257 
  Memory deficit                .069              -.127 
WBRS 
  Social participation          .251*             .789*** 
  Irritability                  .416**            71264" 
  Self care                     .161              -.005 
  Appearance of sadness         .077              .264* 
  Feelings of unreality         327s              442m 
  Resistiveness                 -.055             .047 
  Confusion                     .400**            475" 
Intercept                       -2.046            -0.559 

Note. N = 697. Change scores were calculated as follows: 
[(5 or 6 week symptom score - pretreatment score) + maxi- 
mum possible score]/maximum possible score. 
* p < .05. 
** p < .01. 
*** p < .001. 


based on the total sample of 697, and presents 
them separately for the doctor’s and the 
nurse’s improvement ratings. 

The multiple correlations (.632 and .594) 
indicate that a significant portion of the vari- 
ance in clinical judgments of global improve- 
ment can be accounted for by a linear combi- 
nation of symptom-reduction scores. How- 
ever, even though the multiple correlations 
are significant, their observed values are in- 
flated to some unknown extent. Even if the 
observed size of the multiple R’s were not 
biased, there would still be a sizable propor- 
tion of the criterion variance unaccounted 
for. It is possible that configural scoring 
would increase our predictive efficiency, but 
this has not been tested on these data. As one 
would expect, the doctors seem to be influ- 
enced more by symptom changes in the inter- 
view, while the nurses place greater stress on 
ward-behavior changes. 
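
One conventional way to gauge how much a multiple correlation is inflated by fitting many predictors is the standard adjusted R-squared correction. The sketch below applies it to the two reported values (R = .632 and .594, N = 697, 21 predictors). The authors do not report such a correction, so this is offered only as an assumption-laden illustration, and the function name is hypothetical.

```python
def adjusted_multiple_r(r: float, n: int, k: int) -> float:
    """Shrunken multiple R from the usual adjusted R-squared formula.

    r: observed multiple correlation, n: sample size, k: number of predictors.
    """
    r2_adj = 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)
    # A negative adjusted R-squared is conventionally reported as zero.
    return max(r2_adj, 0.0) ** 0.5


doctor_r = adjusted_multiple_r(0.632, n=697, k=21)  # roughly .62
nurse_r = adjusted_multiple_r(0.594, n=697, k=21)   # roughly .58
```

With nearly 700 cases and 21 predictors the correction is small, which is consistent with the text's point that a sizable share of criterion variance would remain unaccounted for even without bias.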

Column 1 in Table 1 demonstrates that 
greater global improvement as rated by the 
doctor is associated with greater reduction in 
hostility, guilt, agitation and tension, slowed 
speech and movements, indifference to en- 
vironment, pressure of speech, ideas of perse- 
cution, (poor) social participation, irritabil- 
ity, feelings of unreality and confusion, but 
little reduction of incoherent speech. Appar- 
ently patients who show a reduction in inco- 
herent speech tend to improve less in a global 
sense. Incoherent speech seems to be behaving 
as a suppressor variable in that the correla- 
tion of its reduction against global improve- 
ment is essentially zero (.074). 
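
The suppressor pattern described here (a near-zero correlation with the criterion combined with a substantial negative regression weight) can be reproduced with a toy data set. The numbers below are synthetic and chosen so the arithmetic is exact; they have no connection to the study data, and all names are illustrative.

```python
def center(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [x - m for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pearson(a, b):
    """Product-moment correlation between two equal-length lists."""
    a, b = center(a), center(b)
    return dot(a, b) / (dot(a, a) * dot(b, b)) ** 0.5


def two_predictor_weights(x1, x2, y):
    """Least-squares slopes for y regressed on x1 and x2 (with intercept)."""
    x1, x2, y = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


signal = [1.0, -1.0, 1.0, -1.0]      # what the criterion reflects
nuisance = [1.0, 1.0, -1.0, -1.0]    # irrelevant to the criterion
criterion = signal[:]
p1 = [s + n for s, n in zip(signal, nuisance)]  # useful predictor, contaminated
p2 = nuisance[:]                                # suppressor: pure contamination

r_p1 = pearson(p1, criterion)   # about 0.71
r_p2 = pearson(p2, criterion)   # exactly 0.0
w1, w2 = two_predictor_weights(p1, p2, criterion)  # 1.0 and -1.0
```

The second predictor earns a negative weight because subtracting it removes the irrelevant variance from the first predictor, even though it is itself uncorrelated with the criterion.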

Symptom areas without significant regres- 
sion coefficients should not be interpreted as 
showing no change under treatment. For that 
matter, in another analysis of these data, 
Goldberg, Klerman, and Cole (1965) show 
significant reduction of every one of these 
symptom areas from pre- to posttreatment 
when under drug treatment. 


Composite Improvement Scores 


The regression equations in Table 1 can 
form the basis for an improvement score 
which is a composite of symptom-reduction 
scores, weighted according to their relation- 
ship to globally rated improvement. Such an 
improvement composite would have the ad- 
vantage over other conceivable composites of 
being on the same scale as the global improve- 
ment rating whose scale points are described 
in face-valid, clinically meaningful terms 
(e.g., no change, much improved, etc.). Addi- 
tionally, the composite would have the psy- 
chometric advantage of being based on a 
large number of items instead of a single item, 
possibly leading to an increase in reliability 
and validity.⁵ If the latter were established, 
the improvement composite could be offered 
as a more sensitive measure of improvement 
in acute schizophrenics for use in treatment 
evaluation or in other research requiring the 
measurement of improvement. 
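
A composite of this kind is simply the regression intercept plus a weighted sum of a patient's symptom-reduction scores. The following minimal sketch uses made-up weights, an invented intercept, and only two variables to show the mechanics; it does not reproduce the Table 1 equation, and the function name is hypothetical.

```python
def composite_improvement(scores: dict, weights: dict, intercept: float) -> float:
    """Intercept plus the weighted sum of symptom-reduction scores."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing symptom-reduction scores: {sorted(missing)}")
    return intercept + sum(weights[name] * scores[name] for name in weights)


# Illustrative weights and intercept only (not the published coefficients).
example_weights = {"hostility": 0.5, "confusion": 0.4}
example_scores = {"hostility": 0.8, "confusion": 0.5}
composite = composite_improvement(example_scores, example_weights, intercept=1.0)
```

Because the weights come from regressing the global rating on the change scores, the result is expressed on the same 1-to-7 scale as the rating itself.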

The relative power in discriminating be- 
tween drug and placebo patients by t test in 
our first study was taken as an indication of 
sensitivity. By now it is acknowledged as an 
established fact that phenothiazines do effect 
greater improvement than placebo in schizo- 
phrenic patients; we reasoned that the im- 
provement measure which discriminates drug 
from placebo patients with a larger t value 
is the more sensitive one. Table 2 shows 
means, standard deviations, and t’s for the 
doctor’s and nurse’s global ratings and im- 
provement composite scores. For both the 
doctors and the nurses the t value between 
drug and placebo patients is higher for the 
improvement composite than it is for the 
global improvement rating, even though the t 
value for the latter is quite substantial in it- 
self.⁶ Note that the mean values for the global 


5 Since the composite is not based on a maximiza- 
tion of drug-placebo differences, it is possible for the 
composite to be more, less, or equally as sensitive as 
the global rating of improvement. 

6 Although in Table 2 the variability is signifi- 
cantly greater for the placebo than for the drug 
group for all measures, t tests based on samples of 
such large size have been found to be robust with 
respect to departures from constant variance (Bo- 
neau, 1960). Typically, in generating a composite 
where weights are obtained to maximize the rela- 
tionship with a criterion, the degrees of freedom 
are reduced by the number of predictors. Subse- 
quent use of the composite in a drug-placebo con- 
trast which was not involved in developing the 
composite is typically viewed as not requiring a 
reduction in the degrees of freedom. 

7 Part of the specific motivation underlying the 
construction of a more sensitive measure of improve- 
ment was our interest in the problem of predicting 
response to drug treatment by means of pretreat- 
ment symptom patterns. Before undertaking the 
prediction problem, we felt it necessary to develop 
a more sensitive measure of improvement, even 
though the single-item global rating was itself fairly 
sensitive. It is worth mentioning that, using the im- 
provement composite developed in this paper, we 
have been able to predict improvement from pre- 
treatment symptoms differentially in four drugs and 
have shown consistency of prediction in the same 
drug between two study samples (Goldberg, Matts- 
son, Cole, & Klerman, 1967). 




rating and the composite are identical, but 
that the sigmas (and by implication the stand- 
ard errors of the means and differences) for 
the composite are smaller. 


Discussion 


The clinical judge has been represented by 
Meehl (1950) as one who arrives at his 
judgments by somehow combining relevant 
considerations in a nonlinear, configural fash- 
ion. In the past 15 years, a number of con- 
figural scoring methods have been suggested 
(Horst, 1954; McQuitty, 1957) as an out- 
growth of Meehl’s contention. The present 
study attempted to determine the degree to 
which one could account for the variance in 
clinical judgments of global improvement by 
means of linear combinations of aspects of 
symptom reduction, a procedure involving 
fewer and less complex assumptions than con- 
figural scoring. Our results indicate, by means 
of highly significant and consistent multiple 
correlations and regression equations, that we 
can do so. This result in itself would not deny 
the possibility that additional predictive effi- 
ciency might be obtained by configural meth- 
ods, but some doubt is cast on this possibility 
by the fact that the composite improvement 
score, based on the linear combinations of 
symptom reduction, is a more powerful dis- 
criminator between drug- and placebo-treated 
patients than the global judgment of improve- 
ment itself. If globally rated improvement 
contained configural elements representing 
valid improvement variance, the global rating 
ought to have been the more powerful dis- 
criminator. 

Aside from questions on the nature of clin- 
ical judgment, there is an issue in studies of 
psychopathology and treatment evaluation 
concerning the measures that should be used 
as indicators of improvement. Globally rated 
improvement was used in early studies because 
it is face-valid and simple to obtain. How- 
ever, the global rating was criticized as being 
factorially complex and as a glossing over of 
the various ways in which patients might 
change. Two patients equally improved in a 
global sense might have completely different 
patterns of change in specific aspects of psy- 
chopathology. Moreover, in treatment-evalua- 
tion studies a treatment might affect some 


TABLE 2 

RELATIVE POWER OF CLINICAL COMPOSITE SCORE AND 
GLOBAL RATING OF IMPROVEMENT 

                 Clinical composite      Global rating 
Rating              M        σ             M        σ 
Doctor 
  Placebo          3.5     1.08           3.5     1.49 
  Drug             2.2     0.76           2.2     1.05 
  t value             10.78                   8.15 
Nurse 
  Placebo          3.5     1.10           3.5     1.39 
  Drug             2.3     0.70           2.3     1.06 
  t value              9.88                   7.60 


elements of psychopathology but not others; 
such a phenomenon might be obscured by use 
of a single rating of improvement. Out of this 
consideration, multidimensional scales based 
on factor analyses were developed to reflect 
relatively independent dimensions of psycho- 
pathology (Lorr et al., 1963; Wittenborn, 
1962). Significant as these contributions were, 
it meant that treatment-evaluation results had 
to be stated as many times as there were 
scales, thus rendering the results difficult to 
integrate and conceptualize. Lorr, for exam- 
ple, realized this difficulty and compromised 
by using three second-order factors, a pro- 
cedure with another set of difficulties. 

The point of view of the present authors is 
that despite the undeniable value of multidi- 
mensional scales, there is also a need for a 
single summary statement of a patient’s im- 
provement. The practice is often followed of 
obtaining total sum scores over a large num- 
ber of ratings of specific elements of psycho- 
pathology. This procedure is criticized by the 
clinician who asks, “How do you know that’s 
what I mean by improvement?” 

The procedure employed in this study an- 
swers the clinician’s question by saying, “I 
know this measure is improvement because it 
correlates well with your own direct judgment 
of improvement and even does a better job in 
discriminating patients we already know to 
improve differently.” In this respect, the im- 
provement composite developed here is of- 
fered as a more sensitive measure of improve- 
ment in treatment-evaluation studies. 

A final note of caution should be made to 
the effect that the present results may at best 
be generalized to acutely ill schizophrenic 
patients. It seems likely that a completely 
different composite would be necessary for 
depressed or neurotic patients. 


VERBAL ASSOCIATIVE STABILITY AND COMMONALITY AS 
A FUNCTION OF STRESS IN SCHIZOPHRENICS, 
NEUROTICS, AND NORMALS 


LOWELL H. STORMS, WILLIAM E. BROEN, Jr., AND IRWIN P. LEVIN 
Neuropsychiatric Institute, UCLA Center for the Health Sciences 


4 associations to each of 16 stimulus words, 8 judged to be anxiety words and 
8 neutral words, were obtained under relaxed and time-pressure conditions 
from each of 40 schizophrenics, 32 neurotics, and 27 normals on 2 successive 
days. Schizophrenics and neurotics were significantly less stable than normals 
in their associations, and schizophrenics were significantly less stable than 
neurotics in their responses to anxiety words. Time pressure made schizo- 
phrenics even less stable and neurotics more stable. The associations of schizo- 
phrenics were more uncommon than those of neurotics or normals. All groups 
gave more uncommon responses when responding to anxiety words as com- 
pared to control words. The results of the experiment suggest that a partial 
disorganization of verbal habits is an aspect of schizophrenic thought dis- 
turbance, and the results are consistent with a response-strength ceiling in- 
terpretation of this disorganization. 


It has been reported that schizophrenics 
produce more uncommon, idiosyncratic as- 
sociations than normals or other psychiatric 
patients (Deering, 1963; Johnson, Weiss, & 
Zelhart, 1964; Sommer, Dewar, & Osmond, 
1960) and that schizophrenics tend to be 
overinclusive and inappropriate in their think- 
ing, apparently because of associative in- 
trusions (Buss & Lang, 1965; Chapman, 
1958; Mednick, 1958; Payne & Hewlett, 
1960). The specific nature of the schizo- 
phrenics’ unusual associations has not been 
well defined. One possibility is that schizo- 
phrenics’ unusual associations are relatively 
stable, though private, reflecting defensive 
withdrawal or lack of involvement in mean- 
ingful communication. On the other hand, re- 
mote, uncommon associations may represent 
a breakdown in the schizophrenic’s associative 
organization, in which case instability of as- 
sociations would be expected. Comparing 
schizophrenics with normals on repeated ad- 
ministrations of the Kent-Rosanoff Word As- 
sociation Test, Sommer, Dewar, and Osmond 
(1960) found that schizophrenics were more 
unstable in their associations. Dokecki, Poli- 
doro, and Cromwell (1965) found that poor 
premorbid schizophrenics were both more un- 
common and less stable in their associations 
than good premorbid schizophrenics or tu- 
bercular patients. 

Such instability of associations in schizo- 
phrenics is consistent with the account of 
associative disorganization in schizophrenia 
which has been given by Broen and Storms 
(1961, 1964, 1966). This account utilizes the 
concept of arousal (drive) as an energizer of 
response hierarchies within a Hullian frame- 
work, and proposes as an additional concept 
a response-strength ceiling. In a situation 
with a number of competing responses, when 
the dominant response has reached ceiling as 
the result of increased drive, further drive in- 
creases raise the strengths of the competing 
responses, but not the dominant response. 
Thus, the probability of occurrence of the 
competing responses is increased. As a num- 
ber of response tendencies approach the ceil- 
ing, their response probabilities become more 
nearly equal and partial randomization of the 
response hierarchy occurs, leading to unstable 
responding. The theory states that schizo- 
phrenics have lower average response-strength 
ceilings than normals (Broen & Storms, 1966) 
and are, therefore, more susceptible to ceiling 
effects. In this view, the thought disorder 
which is a major criterion for the diagnosis of 
schizophrenia is due to the restriction of 
dominant responses by the lower ceiling. 
Since ceiling effects are displayed in the be- 
havior which leads to the diagnosis of schizo- 
phrenia, they should also be observed in 
standard experiments testing associations to 
familiar stimuli. Since the dominant responses 
of schizophrenics are already restricted by 
ceiling, increases in arousal will increase the 
strengths of responses lower in the hier- 
archy, but not the dominant response. Be- 
cause other responses have become more equal 
to the dominant response in strength, they 
will occur more often, and having a number 
of responses at nearly equal strength will re- 
sult in greater variability of responding. Nor- 
mals, whose dominant responses are less re- 
stricted by ceiling, should show less ceiling 
effects and less equalization of response 
strengths when arousal is increased. In the 
present study, we have attempted to extend 
the theory to neurotics. Our exploratory hy- 
pothesis is that neurotics have their dominant 
responses less restricted by ceiling than 
schizophrenics, and, therefore, that they 
should show less ceiling effects than schizo- 
phrenics and less equalization of response 
strengths when arousal is increased. 
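
The ceiling argument can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: response strength is taken as habit strength multiplied by drive (a Hullian convention), strengths are clipped at the ceiling, and response probability is assumed to be proportional to clipped strength. The habit values, drive levels, and ceilings are hypothetical and are not taken from Broen and Storms.

```python
def response_probabilities(habit_strengths, drive, ceiling):
    """Multiply each habit strength by drive, clip at the ceiling, then normalize.

    Assumption for illustration: probability is proportional to clipped strength.
    """
    strengths = [min(h * drive, ceiling) for h in habit_strengths]
    total = sum(strengths)
    return [s / total for s in strengths]


# Hypothetical response hierarchy, dominant response first.
habits = [1.0, 0.6, 0.4, 0.2]

low_drive = response_probabilities(habits, drive=1.0, ceiling=2.0)
high_drive = response_probabilities(habits, drive=3.0, ceiling=2.0)

# A lower ceiling (as the theory posits for schizophrenics) at the same high drive.
low_ceiling = response_probabilities(habits, drive=3.0, ceiling=1.0)
```

In this toy hierarchy the probability of the dominant response falls as drive rises once the ceiling is reached, and falls further when the ceiling is lowered, which is the pattern of partial randomization the account describes.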

This account predicts that the instability as 
well as remoteness of associations displayed 
by schizophrenics will be initially greater 
than that of other groups and will increase 
more with increases in drive. These are the 
primary hypotheses of this experiment. 

In order to obtain a measure of stability, 
an association test was administered on 2 suc- 
cessive days to schizophrenics, neurotics, and 
normals. Drive was varied in two ways: by 
using anxiety-producing and neutral words as 
stimuli and by presenting half the words un- 
der time-pressure conditions and half under 
relaxed conditions. Commonality was defined 
in terms of correspondence with normative 
data. It was hypothesized that schizophrenics 
would be more unstable and more uncommon 
in their responses than neurotics or normals, 
and that time-pressure conditions and anxiety 
words would increase the uncommon associa- 
tions and instability of schizophrenics, but 
not of the other two groups. 


METHOD 


Subjects 


Twenty-seven normals, 32 neurotics, and 40 schizo- 
phrenics served as subjects (Ss). The neurotics and 
schizophrenics were psychiatric inpatients at the 
UCLA Neuropsychiatric Institute. Neurotics were 
defined by hospital diagnosis of any kind of psycho- 
neurosis and classification of their MMPI profiles 
as neurotic by the Meehl-Dahlstrom rules (1960). 
Schizophrenics were defined by hospital diagnosis 
of schizophrenia, classification of MMPI profiles as 
psychotic, and the judgment of the senior author 
(who did not run the subjects and had no access 
to the data prior to the judgments) that the MMPI 
configuration suggested schizophrenia rather than 
some other kind of psychosis. All psychiatric pa- 
tients who served as Ss did so within 4 days of 
admission to the hospital. Thus, those patients 
taking tranquilizing drugs (21 schizophrenics were 
receiving phenothiazines and 8 were on other drugs) 
had barely been started on drug therapy. The mean 
of the schizophrenics’ scores on the Phillips Prog- 
nostic Rating Scale (Phillips, 1953) was 16.5, 
SD =3.28 The normal Ss consisted of nurses, psy- 
chiatric technicians, and general medical and sur- 
gical patients from the UCLA Medical Center. The 
11 male and 16 female normals had an average 
age of 26.6 years and an average IQ estimate (from 
the Shipley-Hartford Vocabulary) of 127; the 15 
male and 17 female neurotics, an average age of 
38.5 years and an IQ of 124; and the 17 male and 
23 female schizophrenics an average age of 27.7 
years and an IQ of 121. 


Stimulus Word Lists 


Stimulus materials consisted of 13 of the cate- 
gory names from the Cohen, Bousfield, and Whit- 
marsh (1957) norms and 3 additional category 
names supplied by the experimenter so that there 
would be 8 anxiety-producing names to compare 
with 8 neutral names. An attempt was made to 
equate the neutral and anxiety-producing names for 
the frequency of the most frequent response in the 
norms (neutral mean = 306, anxiety mean = 301). 
However, it was not possible to match the fre- 
quencies of secondary and tertiary responses. These 
frequencies were lower for anxiety-producing names 
than for neutral names (secondary response means: 
239 and 220; tertiary response means: 198 and 151). 
Category names were called anxiety producing or 
neutral if 10 out of 12 PhD clinical psychologists 
at the UCLA Neuropsychiatric Institute agreed in 
their judgments that they would be anxiety pro- 
ducing or neutral in their effects upon patients. The 
names are presented in Table 1. 

Four lists were constructed. First, two lists were 
prepared, each consisting of four neutral and four 
anxiety words. An anxiety word began one list and 


3 Correlations between prognosis scores and scores 
for instability and uncommonality were -.12 and 
.01, neither of them statistically significant. 


TABLE 1 

CATEGORY NAMES 

Neutral categories      Anxiety-producing categories 
Birds                   Crimes 
Colors                  Disasters* 
Countries               Diseases 
Flowers                 Parts of the human body 
Kinds of cloth          Poisons* 
Metals                  Snakes 
Sports                  Types of punishment* 
Vegetables              Weapons 

* Names supplied by the experimenter, not from norms. 


a neutral word the other, and anxiety and neutral 
words were alternated in each list. Two more lists 
were constructed in a similar manner using the anx- 
iety words from one of the original lists and the 
neutral words from the other. 


Procedure 


The Ss were tested individually in two separate 
sessions. They were instructed to give four examples 
of each category, writing them in the four blanks 
to the right of each stimulus word. One list of 
eight category names was presented under relaxed 
conditions and one list under the time-pressure con- 
ditions in which a large and loudly ticking Gralab 
timer was set on the table before S and he was 
told to work as fast as he could. Instructions for the 
relaxed and time-pressure conditions (for those Ss 
who received the relaxed condition first) were as 
follows: 


First List-Not Timed 


I am going to give you a sheet of paper... 
You will see the names of various kinds of things 
and after each one you will see four blank spaces. 
In the blank spaces, write four examples of the 
kind of thing given on that line. 


As an illustration, you might see the word planets. 
We want you to write four examples of planets 
such as Earth, Mars, Venus, and Saturn. If the 
word is jewelry, you might write ring, necklace, 
bracelet, earring. 


Remember you are to give four examples of each 
kind of thing. 

Work at your own speed. There are no time limits, 
so relax as much as possible while you work. 

Second List-Timed 
(When nonpressure condition was given first) 

This time you will be given another list, but it 
is important to work as fast as you can. The 
faster you work, the better your performance. 
We will time your responses with this timer. 
I've put it on the table so you can see how fast 
you are going. Remember to give four examples 
of each kind of thing as quickly as you can. 
We will be timing you in two minute intervals. 


Each S was presented two lists in each session. 
There was a rest period of 5 minutes between the 
two lists. Counterbalancing was done for the order 
of the relaxed and time-pressure conditions, which 
of the two pairs of lists was presented to the S, and 
which of the two lists was presented first. 
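
The counterbalancing scheme just described crosses three two-level factors, giving eight cells. The sketch below enumerates them; the labels are hypothetical, and the rotation of subjects through the cells is an assumption made for illustration, since the report does not state how individual Ss were assigned to cells.

```python
from itertools import product

# Hypothetical labels for the three counterbalanced factors.
condition_orders = ["relaxed_first", "timed_first"]
list_pairs = ["pair_1", "pair_2"]
first_lists = ["list_a_first", "list_b_first"]

# Every combination of the three factors: 2 x 2 x 2 = 8 cells.
cells = list(product(condition_orders, list_pairs, first_lists))


def assign_cells(subject_ids):
    """Rotate subjects through the cells in order (an assumed assignment rule)."""
    return {sid: cells[i % len(cells)] for i, sid in enumerate(subject_ids)}


# Example: 40 subjects, the size of the schizophrenic group.
assignment = assign_cells(range(1, 41))
```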

Each S was retested on the following day under 
the same conditions which had been assigned to him 
for the original test. He was told it did not matter 
whether or not he gave the same responses he had 
given on the previous day. 

The measure of associative stability was the total 
number of responses for each category name which 
were given on both testing occasions (number of 
repeats). The measure of uncommonality of associ- 
ations was obtained by giving a response a score 
of 1 if it occurred among the first four responses 
ranked according to frequency in the Cohen, Bous- 
field, and Whitmarsh (1957) norms, a score of 2 
if it occurred in the second four responses, a score 
of 3 if in the third four responses, and a score of 
4 if the response was any lower in frequency in 
the norms. These scores were summed over the 13 
categories from the Cohen, Bousfield, and Whit- 
marsh (1957) norms. The three categories supplied 
by the authors were not used in obtaining uncom- 
monality scores because norms are not available 
for them. 
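
The two measures defined above can be sketched in a few lines. The function names and the example norm list are hypothetical (the ordering shown is invented, not the Cohen, Bousfield, and Whitmarsh frequencies), and scoring a response that is absent from the norms as 4 is an assumption about how such responses were handled.

```python
def stability_score(day1_responses, day2_responses):
    """Number of repeats: responses to one category name given on both days."""
    return len(set(day1_responses) & set(day2_responses))


def uncommonality_score(response, ranked_norms):
    """Score 1 for normative ranks 1-4, 2 for ranks 5-8, 3 for ranks 9-12, else 4.

    ranked_norms lists normative responses from most to least frequent.
    Assumption: a response not found in the norms is scored 4.
    """
    try:
        rank = ranked_norms.index(response)  # 0-based position in the norms
    except ValueError:
        return 4
    return min(rank // 4 + 1, 4)


# Invented ordering for illustration only; not the published norms.
hypothetical_norms = [
    "robin", "sparrow", "cardinal", "bluejay",
    "eagle", "crow", "canary", "hawk",
    "wren", "parakeet", "pigeon", "oriole",
    "owl",
]

day1 = ["robin", "eagle", "crow", "owl"]
day2 = ["robin", "owl", "hawk", "wren"]
repeats = stability_score(day1, day2)
day1_scores = [uncommonality_score(r, hypothetical_norms) for r in day1]
```

Summing the uncommonality scores over responses and categories, as described in the text, then gives one score per subject and condition.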

Analyses of variance were performed using these 
measures for 3 (diagnostic groups) × 2 (drive condi- 
tions) × 2 (kinds of words: anxiety producing and 
neutral). 


RESULTS 


Associative Stability 


Males and females did not differ on associ- 
ative stability (F < 1), and there was not a 
significant interaction between sex and kind 
of stimulus word (F < 1). 


TABLE 2 

ANALYSIS OF VARIANCE FOR STABILITY SCORES 

Source                   df      MS        F 
Between 
  Groups (B)              2    77.85     7.64** 
  Error (between)        96    10.19 
Within 
  Categories (A)          1    56.4     14.0** 
  Time pressure (C)       1     0.6      0.15 
  A × B                   2    12.5      3.10* 
  B × C                   2     9.5      2.36 
  A × C                   1     0.1      0.02 
  A × B × C               2     6.2      1.54 
  Error (within)        288     4.03 

* p < .05. 
** p < .01. 

There was a significant difference among 
groups on the measure of associative stabil- 
ity (F = 7.64, df = 2/96, p < .01) as shown 
in Table 2. Both the schizophrenic and neu- 
rotic groups were more unstable than nor- 
mals (p < .01), but the schizophrenics were 
not significantly more unstable than the 
neurotics, although the trend was in that 
direction. Group means are presented in 
Table 3. All groups were significantly more 
unstable on the anxiety-producing categories 
than on neutral categories (F = 14.0, df= 
1/288, p<.01). There was a significant 
interaction between kinds of categories and 
groups, due to more marked effect of anxiety- 
producing categories on schizophrenics (F = 
3.10, df = 2/288, p< .05). Schizophrenics 
were not more unstable than neurotics on 
neutral categories, but were significantly 
more unstable on anxiety-producing categories 
(t = 2.45, df = 70, p < .02). While time 
pressure had no overall effect on associative 
stability, in a separate analysis its effects on 
schizophrenics and neurotics were significantly 
different (Time Pressure × Groups inter- 
action: F = 5.7, df = 1/210, p < .05). Under 
time pressure, associative stability decreased 
in schizophrenics and increased in neurotics. 
The direction of change was the same for 
neutral and anxiety-producing categories, but 


TABLE 3 

GROUP MEANS AND STANDARD DEVIATIONS OF UNCOMMONALITY 
SCORES AND STABILITY (NUMBER OF REPEATS) FOR 
NEUTRAL AND EMOTIONAL WORDS 

                      Uncommonality             Stability 
                   Emotional   Neutral      Emotional   Neutral 
Group                words      words         words      words 
Normals 
  M                  1.89a      1.73         22.12      23.07 
  SD                  .92        .84          3.55       3.91 
Neurotics 
  M                  1.89       1.71         20.53      21.00 
  SD                  .90        .84          2.05       3.16 
Schizophrenics 
  M                  2.04       1.79         18.03      21.17 
  SD                  .92        .86          4.67       2.46 

a Mean uncommonality score per stimulus word. 


TABLE 4 

ANALYSIS OF VARIANCE FOR UNCOMMONALITY SCORES 

Source                   df      MS        F 
Between 
  Groups (B)              2    1.025     6.29* 
  Error (between)        96     .163 
Within 
  Time pressure (C)       1    0.02      0.38 
  Categories (A)          1    4.83     91.48** 
  Days (D)                1    0.01      0.19 
  A × C                   1    0.10      1.89 
  B × C                   2    0.15      2.84 
  C × D                   1    0.01      0.19 
  A × B                   2    0.00      0.00 
  A × D                   1    0.14      2.65 
  B × D                   2    0.01      0.19 
  A × B × C               2    0.08      1.52 
  B × C × D               2    0.01      0.19 
  A × C × D               1    0.04      0.76 
  A × B × D               2    0.06      1.14 
  Pooled A × B × C × D 
    Error (within)      674     .0528 

* p < .05. 
** p < .01. 


only for anxiety categories were the differ- 
ences significant in both groups (for schizo- 
phrenics, t = 2.02, df = 39, p < .05; for 
neurotics, t = 2.38, df = 31, p < .05). 
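
As a check on the arithmetic, each F reported for stability is the ratio of the effect mean square to the appropriate error mean square in the analysis of variance for stability scores (Table 2 of this report). The short sketch below reproduces three of the reported values from the tabled mean squares; the helper name is hypothetical.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


# Mean squares taken from the stability analysis of variance.
groups_f = f_ratio(77.85, 10.19)       # Groups, tested against between-Ss error
categories_f = f_ratio(56.4, 4.03)     # Categories, tested against within-Ss error
interaction_f = f_ratio(12.5, 4.03)    # Categories x Groups interaction
```

Rounded, these give 7.64, 14.0, and 3.10, in agreement with the values cited in the text.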


Commonality 


Males were significantly less common than 
females in their associations (F = 6.71, df = 
1/96, p < .05), but the interaction with word 
type was not significant (F < 1). The lower 
commonality for males should not affect the 
differences among diagnostic groups, since 
the proportion of males in each group is 
comparable (41% for normals, 42% for 
schizophrenics, 47% for neurotics). 

There was a significant difference among 
the groups on the uncommonality measure 
(F = 6.29, df = 2/96, p < .05) as shown in 
Table 4. The means for neurotics and nor- 
mals were almost identical, while a specific 
comparison revealed that schizophrenics were 
significantly more uncommon in their associa- 
tions than the other two groups combined 
(F = 12.4, df = 1/96, p < .001). All groups 
were more uncommon on the anxiety-pro- 
ducing categories than on neutral categories 
(F = 91.48, df = 1/674, p < .001). There 
were no significant effects of time pressure on 
uncommonality scores. 


Correlation Analysis 


If instability and uncommonality of as- 
sociations both reflect disorganization of 
associative hierarchies, they should correlate 
significantly. An r = .32 (p < .05) for 
neurotics, an r = .36 for schizophrenics 
(p < .05), and an r = .56 for normals 
(p < .01) are consistent with this hypothe- 
sis. Although uncommonality and instability 
share some common variance, which may 
reflect disorganization of hierarchies, other 
important influences, such as idiosyncratic 
learning histories or a set to give original 
or unusual associations, may account for 
some of the substantial remaining variance. 


Discussion 


In this study, schizophrenics, when com- 
pared with control groups, were not only 
more unusual and idiosyncratic in their asso- 
ciations, as in previous research (Deering, 
1963; Sommer, Dewar, & Osmond, 1960), 
but were also more unstable from one day 
to the next. 

Schizophrenics were not only more unstable 
than normals, replicating a finding of Som- 
mer, Dewar, and Osmond (1960), but were 
also more unstable than neurotics on the 
anxiety categories. This is consistent with the 
view that a primary feature of schizophrenic 
associative disturbances consists of partial 
randomization of dominant and competing 
responses (disorganization of associative 
hierarchies) rather than (or in addition to) 
stable defensive reactions to stress. 

The instability of schizophrenics increased 
under time pressure and was greater in re- 
sponse to anxiety words, as hypothesized from 
an account presented by Broen and Storms 
(1961, 1964, 1966). This account states that 
the dominant responses of schizophrenics are 
restricted by lower ceilings, resulting in 
greater susceptibility to disorganization when 
stress is increased. Since normals and neu- 
rotics are considered to have less dominant 
response restriction by ceiling than schizo- 
phrenics, neither normals nor neurotics should 
show as much increase in instability as 


schizophrenics under time-pressure or anxiety 
conditions, and in this experiment they did 
not. 

While we did predict the finding that 
schizophrenics displayed a greater decrease 
in stability under time pressure than neu- 
rotics (who actually showed increased stabil- 
ity), we cannot account for the lower stability 
in neurotics than in normals. No predictions 
were made comparing neurotics to normals, 
but this result was not anticipated. 

The lower commonality and lower stability 
of all groups on emotional words can be ac- 
counted for in terms of the differential as- 
sociative characteristics of the neutral and 
emotional stimuli, although the differences 
between schizophrenics and the other groups 
cannot be accounted for in such terms. It 
will be recalled that neutral and emotional 
words were equated for average normative 
frequency of primary response, but that the 
mean frequencies of secondary and tertiary 
responses were less for emotional stimuli 
than neutral stimuli. In addition, there was 
a larger number of different responses in the 
normative data for emotional words than for 
neutral words (means of 96.8 versus 60.4). 
Since the probability of secondary and terti- 
ary responses, which were included in the 
most common response class and assigned a 
normative score of 1, is lower in the norms 
for emotional words than for neutral words, 
the frequency of common responses would be 
expected to be less and that of uncommon 
responses to be greater for emotional words, 
as was actually observed in all three groups 
of Ss. The same effect would be expected 
on the basis of the apparently greater variety 
of responses available for emotional words. 
In support of this interpretation, it was 
found that the frequency of primary re- 
sponses given by Ss in this study was at 
least as great for emotional as for neutral 
words. A study by Woods (1961), which 
found that schizophrenics gave more un- 
common responses to anxiety words than 
neutral words, used no control groups. The 
same result was obtained for normals and 
neurotics using most of the same stimuli in 
the present study, and was apparently due 
to the associative characteristics of the anx- 
iety and neutral stimuli. This result is simi- 
lar to the finding of Johnson, Weiss, and 
Zelhart (1964) that both normals and psy- 
chotics were more idiosyncratic in response 
to “bad” words than to “good” words. 


TABLE 5 

PROPORTION OF REPEATS BY SUBJECT GROUP FOR 
EACH COMMONALITY CATEGORY 

                          Commonality category 
Subject group              1         2         3 
Normals 
  Emotional words        85.88a    78.48     52.35 
  Neutral words          83.40     73.54     59.05 
Neurotics 
  Emotional words        83.57     58.33     49.53 
  Neutral words          80.07     67.14     38.30 
Schizophrenics 
  Emotional words        83.01     55.83     39.52 
  Neutral words          78.33     67.04     47.20 

Note. Since there were too few responses in Category 4 to 
provide stable frequencies, this category is omitted. 

a Figures represent the percentage of the first day responses 
in the designated commonality category which were repeated 
on the second day. 

The fact that all groups were also more 
unstable in their responses to emotional words 
seems to derive from the lower commonality 
for emotional words. We have seen that there 
is a significant correlation between commonal- 
ity and stability across subjects. There is an 
even stronger relationship across stimulus 
words. Table 5 shows that when a high com- 
monality word was given as a response on 
the first day, it was repeated over 80% of 
the time in most cases, while low commonal- 
ity words were repeated less than half the 
time. Since high commonality responses to 
emotional words were repeated at least as 
frequently as those to neutral words, the 
lower overall frequency of repeats for emo- 
tional words appears to be due to the 
difference in commonality. 

However, normative associative character- 
istics cannot account for the observed group 
differences. Neurotics did not differ from 
normals in commonality, but were signifi- 
cantly more unstable. Stability was particu- 
larly low for emotional words among schizo- 
phrenics, whereas the lower commonality for 
emotional words was shared about equally 
by all groups. 

There are two findings in this study which 
are difficult to account for in terms of the 
Broen and Storms formulation, but neither 
is clearly inconsistent with their interpreta- 
tion. One of these is that high commonality 
responses given on the first day were almost 
as often repeated by schizophrenics as by the 
other two groups (see Table 5). If lower 
response-strength ceilings in schizophrenics 
are one cause of their decreased stability, 
a number of response tendencies, including 
that of the dominant response, should be at 
or near ceiling. If this is true, when a domi- 
nant response happens to occur it will be less 
often repeated, because competing responses 
are more nearly equal in strength to the 
dominant response and will occur more often. 
The lack of evidence to confirm this expec- 
tation might be explained as follows: If a 
primary normative response was the domi- 
nant response in such a partially disorganized 
hierarchy and was given on the first experi- 
mental day, there were four chances for it 
to be repeated on the second day, because 
four responses were obtained for each stimu- 
lus. Because the dominant response would be 
at ceiling in such a situation, while competing 
responses may be only near the ceiling, the 
dominant response would be somewhat more 
likely to be repeated than a competing re- 
sponse when four opportunities are given. 
This would reduce the differences in stabil- 
ity between schizophrenics and other groups 
for responses of high normative frequency, 
because they are more likely to be dominant 
responses in individual response hierarchies 
(for discussions of response-strength ceiling 
effects, see Broen and Storms, 1961, 1966). 
Another difficulty for the Broen and 
Storms formulation is that the prediction 
that schizophrenics’ uncommonality would 
increase under time pressure was not con- 
firmed. Time pressure may serve as a drive, 
but it almost certainly has other compli- 
cating effects, such as leading to precipitous 
action with a reduction in mediating proc- 
esses. Depending upon the kind of task, 
various effects of time pressure have been 
obtained. Time pressure has led to more 
pathological responses on the Rorschach in 
normal Ss (Siipola & Taylor, 1952) and 
overinclusion on a sorting task by normals 
(Usdansky & Chapman, 1960). On the other 
hand, Horton, Marlowe, and Crowne (1963) 
found that normals gave more common word- 
association responses under time pressure 
than under relaxed conditions, possibly be- 
cause the standard instructions used in ob- 
taining the norms (Russell & Jenkins, 1954) 
included an injunction to “work rapidly.” 
Since the norms used in the present study 
used instructions with less emphasis on speed, 
the time-pressure instructions may have failed 
to increase the similarity between testing and 
norm-gathering situations, which may account 
for the fact that the normals in the present 
study did not give more common associations 
under time-pressure than under relaxed con- 
ditions. 

Various complex effects of time pressure on 
commonality may have balanced out in the 
present study. However, the failure to influ- 
ence commonality did not prevent time pres- 
sure from increasing the associative instability 
of schizophrenics as compared with neurotics. 

It appears that the response-strength ceil- 
ing account (Broen & Storms, 1961, 1964, 
1966) offers one possible interpretation of 
some of the findings of the present study. As 
hypothesized from this account, associative 
instability and uncommonality correlated sig- 
nificantly in all groups. As predicted from 
this formulation, schizophrenics were more 
unstable and more uncommon in their associ- 
ations than neurotics or normals. As hypothe- 
sized, both time pressure and anxiety words 
were associated with greater instability in 
schizophrenics. 


PERSONALITY INVENTORY SCORES 


RONALD WILCOX 1 
Cochran Veterans Administration Hospital, St. Louis 


AND 


ALAN KRASNOFF 


University of Missouri at St. Louis 


The MMPI and Crowne-Marlowe scales were administered to 50 psychiatric 
inpatients who were grouped by levels of motivation for discharge and then 
randomly assigned to 1 of 2 testing conditions. For the “defensive” condition, 
the patients were told that the tests would determine their readiness for 
discharge; for the “routine” condition, they were assured that the test results 
would not affect their discharge date. The analysis of variance revealed sig- 
nificant main effects for the conditions of testing and levels of motivation on 
several of the standard and social desirability scales. From an analysis of 
interaction effects, it is concluded that Ss are more likely to dissimulate when 
the testing is perceived as a potential barrier to the attainment of their 
immediate goals. 


A recurring question in the literature on 
personality inventories concerns the possible 
effects of test-taking attitudes on test scores. 
Variations in the test-taking attitudes of sub- 
jects (Ss) have been viewed traditionally as 
a source of error variance which acts to 
reduce the validity of personality inventories, 
especially in settings where the consequences 
of the testing may be regarded as of immedi- 
ate importance to the S (Meehl & Hathaway, 
1946). More recently, however, test-taking 
attitudes or “response sets” have been de- 
scribed as an essential part of the response 
process intrinsic to any test situation. From 
the latter point of view, by incorporating this 
variable into a theory of test-taking behav- 
ior, knowledge of the S’s test-taking attitude 
can be used to enhance the validity of the 
assessment process (Crowne & Marlowe, 
1964; Rotter, 1960). 

One approach to the study of test-taking 
attitudes consists of the numerous studies 
demonstrating the susceptibility of scores on 
personality inventories to special instruc- 
tional sets, for example, “fake-good” or 
“fake-bad” sets (Dicken, 1960; Exner, Mc- 
Dowell, Pabst, Stackman, & Kirk, 1963; 
Hunt, 1948). A related approach consists of 
a series of social desirability studies which 
have also used a special instructional set. 
These studies have demonstrated that the 
social desirability values of personality 
inventory items correspond to a remarkable 
degree to the actual frequency of endorse- 
ment of these items by various groups of 
normal Ss in the standard testing situation. 
Edwards (1957) has concluded from these 
social desirability studies that most Ss ap- 
proach the standard testing situation with a 
set to present themselves favorably. Both of 
these approaches, however, are subject to 
the criticism of their artificiality and ques- 
tionable relevance to real-life testing situa- 
tions (Hanley, 1961). While they highlight 
an important source of variance commonly 
found in standard personality inventories 
under these special instructional sets, these 
studies do not provide any direct evidence 
of the operation of such test-taking attitudes 
in typical assessment situations. 


1 Now at Manhattan Veterans Hospital. 

Studies which have attempted to determine 
such influences empirically in real-life situa- 
tions most frequently have employed an ex 
post facto research design (Green, 1951; 
Nakamura, 1960). Typically, in these studies, 
test scores for a “defensive” group, such as 
job applicants, are compared with scores for 
another group, such as employees, who have 
been provided with a standard or research 
set. The interpretations of the results of 
these studies are limited because of the tenu- 
ous assumption that the groups are truly 
comparable on those traits reflected in the 
inventory scores except for the test-taking 
attitude singled out for investigation. 

The authors are aware of only two studies 
of personality inventories which have ma- 
nipulated test-taking attitudes in a real-life 
situation while introducing experimental con- 
trol through the random assignment of Ss to 
the different conditions of testing. These two 
studies have demonstrated significant effects 
of this variable on test scores, although the 
differences in test scores obtained by this 
manipulation have been rather small in 
magnitude (Heron, 1956; Young, 1965). The 
present study represents a further attempt 
to evaluate the influence of test-taking at- 
titudes in a naturally occurring assessment 
situation. By including an evaluation of the 
S’s motivation which is presumed relevant to 
this assessment situation, the effects of both 
the test situation and the S’s motivation as 
mediated through test-taking attitudes can 
then be examined. 


METHOD 

Subjects 


All Ss were male veterans hospitalized in a 20-bed 
psychiatric discharge ward of a general medical 
and surgical hospital. Each S was selected for this 
study on the basis of his having been approved 
for discharge at a discharge conference. At this 
conference, the patient’s physician filled out a rating 
form requesting his judgment of the patient’s moti- 
vation for discharge from the hospital. On the 
basis of these ratings, Ss were assigned by a simple 
alternation procedure to one of the two testing 
conditions, with the provision that the Ns in the 
two testing conditions be equal at each level of 
motivation. The total group consisted of 50 Ss with 
a median age of 40.7 years (range 25 to 70 years), 
a median education of 10.5 years (range 6 to 16 
years), and median length of hospitalization prior to 
testing of 46 days (range 7 to 389 days). 
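
The alternation procedure described above can be illustrated with a minimal sketch. It assumes that alternation was carried out separately within each motivation level, which is one way of keeping the Ns in the two testing conditions equal at each level; the report does not spell out the mechanics, and the function name, level labels, and ratings below are hypothetical.

```python
from collections import defaultdict


def assign_by_alternation(ratings):
    """Alternate Ss between the two testing conditions within each motivation level.

    ratings: list of (subject_id, motivation_level) pairs in order of selection.
    Returns a dict mapping subject_id to "defensive" or "routine".
    """
    conditions = ("defensive", "routine")
    seen_per_level = defaultdict(int)  # Ss already assigned at each level
    assignment = {}
    for subject_id, level in ratings:
        # Even-numbered arrivals at a level go to one condition, odd to the other.
        assignment[subject_id] = conditions[seen_per_level[level] % 2]
        seen_per_level[level] += 1
    return assignment


# Hypothetical ratings, not data from the study.
example_ratings = [(1, "high"), (2, "low"), (3, "high"),
                   (4, "medium"), (5, "low"), (6, "medium")]
assignment = assign_by_alternation(example_ratings)
```

With an even number of Ss at a level, this scheme places the same number in each condition at that level.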


Testing Conditions 


Experience in clinical psychological testing on the 
Psychiatry Service suggested to the investigators 
that a defensive test-taking attitude would most 
likely be elicited in a situation where the patient 
perceived the appraisal as affecting discharge plan- 
ning. On the basis of this experience, the main 
experimental variable which was manipulated in this 
study was the explanation of the purpose of the 
testing provided to the patient by his physician. 
Two sets of instructions supplied to the physicians 
then defined the two testing conditions used in this 
study. For the “defensive” set, the physician in- 
formed the patient that he wished to obtain the 
results of psychological testing before deciding 
whether the patient was ready for discharge. For 
the “routine” set, the physician informed the pa- 
tient that the psychological testing was only routine 
and, while it would be entered in his records, 
would not affect his discharge plans. In order to 
make the explanation for the routine condition more 
convincing to the patient, he was given a definite 
date for his discharge and was tested one or two 
days prior to this time. 

At the discharge conference, when the patient was 
assigned to one of the two testing conditions, the 
physician was provided with a brief summary sheet 
reminding him of the instructions to give to his 
particular patient. At the time the patient was 
tested, the investigators routinely inquired as to the 
patient’s understanding of the purpose of the test- 
ing and repeated the explanation of the evaluation 
as originally provided by the physician. Patients 
who were irregularly discharged and patients for 
whom the investigators had some reason to doubt 
that the proper instructions were carried out were 
eliminated from the study prior to testing. In those 
instances where the physician felt he could not wait 
for the discharge conference or testing procedure 
before informing the patient of his discharge plans, 
the patient was also excluded from the study. Thus, 
the investigators had complete control of the as- 
signment of the Ss to the experimental conditions. 


Ratings of Motivation 


At the discharge conference, when the patient was 
selected for this study, the physician filled out a 
7-point rating scale based on his judgment of the 
patient’s attitude towards discharge. The instruc- 
tions for the rating stressed that it should be 
based as much as possible upon the patient's ex- 
pressed comments without the physician attempting 
to interpret what is the patient’s “underlying or 
true” attitude. The categories varied from one ex- 
treme labeled “demanding,” through a midpoint de- 
scribed as “accepting,” to the other extreme desig- 
nated “reluctant” in attitude towards discharge. On 
the same day the head nurse on the patient's ward 
was also asked to fill out this rating scale. For the 
analysis of variance, this 7-point scale was reduced 
to a 3-category scale combining the Ns in adjacent 
cells. Only the physicians’ ratings were used in the 
analysis of results. 


Tests Administered 


The following tests were administered in the order 
listed: Marlowe-Crowne Social Desirability scale, 
MMPI, Harrower Multiple-Choice Rorschach Test, 
and Rotter Sentence Completion Test. This report 
includes the results from only the Marlowe-Crowne 
Social Desirability scale and the MMPI. 

The MMPI protocols were scored for all the 
standard scales and the subtle and obvious keys of 
the clinical scales. In addition, the following special 
scales were included on the basis of their presumed 
relevance to test-taking attitudes. The notation for 
these scales follows the system used by Dahlstrom 
and Welsh (1960): 

So-r: Edwards’ 39-item Social Desirability scale 
(Edwards, 1957). 

Mp: Cofer et al. Positive Malingering Scale con- 
sisting of 34 items which changed under “fake-good” 
instructions, but did not change under “fake-bad” 
instructions (Cofer, Chance, & Judson, 1949). 

Tt: A 26-item Social Desirability Scale developed 
by Hanley consisting of items in the medium range 
of endorsement by the standardization group which 
are rated toward the extremes in high or low 
social desirability content (Hanley, 1957). 

Ds-r: A 40-item scale developed by Gough which 
discriminated significantly between neurotic Ss and 
Ss instructed to simulate neurotic answers (Gough, 
1954). 

Ad: A 32-item Admission of Symptom Scale by 
Little and Fisher based on a cluster analysis of 
the Hy scale (Little & Fisher, 1958). 

Dn: A 26-item Denial of Symptom Scale by 
Little and Fisher based on a cluster analysis of the 
Hy scale (Little & Fisher, 1958). 

Sd: A 40-item scale developed by Wiggins which 
differentiated between Ss answering under Sd in- 
structions and a control group answering under the 
standard instructions (Wiggins, 1959). Due to an 
oversight, only 29 of the 40 items on the Wiggins 
scale were included in this study. 

In addition to the special scales, two other indexes 
based upon scores on the standard scales were calcu- 
lated. The first was the F minus K index proposed 
by Gough as a measure of positive or negative 
malingering (Gough, 1950). The other index was a 
profile-elevation score based upon a linear weighting 
of the scores on the eight clinical scales (Dahlstrom 
& Welsh, 1960, p. 259). This profile-elevation index 
was calculated with the K correction factor in- 
cluded (profile *) and with the K correction factor 
eliminated (profile). 


RESULTS 


The results were analyzed by means of a 
two-way analysis of variance using a Treat- 
ments × Levels design (Lindquist, 1953). 
F tests were calculated for each of the two 
main effects, namely, those for the two testing 
conditions and those for the three levels of 
motivation for discharge, plus the interaction 
between these two main effects. 
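
The three F tests named above can be sketched for the simplest case. The sketch assumes equal Ns in every cell, which the report does not state for the three motivation levels, and it is not Lindquist's computational routine; the function name and the scores are hypothetical.

```python
def two_way_anova(cells):
    """F ratios for a two-way design with equal n per cell.

    cells[i][j] is the list of scores for testing condition i
    and motivation level j.
    """
    a = len(cells)        # number of testing conditions
    b = len(cells[0])     # number of motivation levels
    n = len(cells[0][0])  # scores per cell (assumed equal everywhere)

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for row in cells for cell in row for x in cell])
    cell_means = [[mean(cell) for cell in row] for row in cells]
    # With equal n, marginal means are simple averages of cell means.
    row_means = [mean(ms) for ms in cell_means]
    col_means = [mean([cell_means[i][j] for i in range(a)]) for j in range(b)]

    ss_a = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_ab = n * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_within = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )

    df_a, df_b = a - 1, b - 1
    df_ab, df_within = df_a * df_b, a * b * (n - 1)
    ms_within = ss_within / df_within
    return {
        "F_conditions": (ss_a / df_a) / ms_within,
        "F_motivation": (ss_b / df_b) / ms_within,
        "F_interaction": (ss_ab / df_ab) / ms_within,
        "df": (df_a, df_b, df_ab, df_within),
    }


# Hypothetical scores: 2 testing conditions x 3 motivation levels, n = 2 per cell.
example_cells = [
    [[10, 12], [8, 10], [6, 8]],  # defensive condition
    [[6, 8], [6, 8], [6, 8]],     # routine condition
]
anova = two_way_anova(example_cells)
```

Each F ratio is then referred to the F distribution with the corresponding numerator degrees of freedom and the within-cells degrees of freedom.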


Comparability of Groups 


F tests for age, education, and the length 
of hospitalization prior to testing were calcu- 
lated in order to determine the comparability 
of the six subgroups in this two-way analysis 
of variance. With three F tests for each vari- 
able, only one of the nine F tests proved 
significant at the .05 level—the one for the 
conditions of testing with the education vari- 
able. Inspection of the subgroup means re- 
vealed that Condition 1, the defensive condi- 
tion, had a significantly higher mean years of 
education due to a larger number of Ss who 
had completed college. Because of the pos- 
sibility that this discrepancy in education for 
the two groups might have affected the main 
analysis, correlations for the entire sample of 
Ss were computed between education and all 
the special scales, validity scales, and the 
clinical scales which showed significant ef- 
fects for the conditions of testing in the main 
analysis. Using a .10 level of confidence, only 
1 of these 14 correlations was significant. A 
negative correlation of minus .49 was obtained 
between education and the Ma obvious key. 
Since 12 of the 13 other correlations ranged 
between plus or minus .17, it was not felt 
that the educational discrepancy between 
these groups significantly affected the main 
results for the analysis of variance. 

The comparability of the six subgroups for 
the final or discharge diagnosis was also 
checked. Inspection of these diagnoses sug- 
gested a three-way classification into neurotic, 
psychotic, or acute brain syndrome (alcoholic 
intoxication). Two-way contingency tables 
were constructed for each diagnostic clas- 
sification and chi-squares were calculated. 
The diagnostic categories appeared to be 
randomly distributed in the subgroups with 
none of these chi-squares approaching the 
.05 level of significance. 


Reliability of Ratings 

An estimate of the reliability of the ratings 
for motivation for discharge was made by 
comparing the ratings of the head nurse with 
those made by the patient’s physician. Since 
not all patients had the same physician, this 
procedure involved the correlation between 
the ratings of one head nurse with the ratings 
made by eight different physicians. The 
Pearson r for the seven scale categories was 
.64, while the correlation for the 3-category 
scale, which was actually used in the analysis 
of variance, was .49. Both coefficients are 
significant beyond the .01 level of confidence. 
Considering the sources of unreliability for 
these ratings, in particular the number of 
different physicians involved for the one set 
of ratings and the expected variations in types 
of patient contact between the head nurse 
and the physician, those correlations suggest 
that the judgments were not difficult for the 
raters to make. 
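
The interrater comparison can be illustrated with a short sketch of the Pearson r computed first on a 7-point scale and then on a collapsed 3-category scale. The cut points used for collapsing and the ratings themselves are hypothetical; the report states only that adjacent categories were combined.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def collapse(rating):
    """Reduce a 7-point rating to 3 categories (hypothetical cut points)."""
    if rating <= 2:
        return 1
    if rating <= 5:
        return 2
    return 3


# Hypothetical ratings of the same seven patients by two raters.
physician = [1, 2, 3, 4, 5, 6, 7]
head_nurse = [2, 1, 4, 3, 6, 5, 7]

r_seven = pearson_r(physician, head_nurse)
r_three = pearson_r([collapse(r) for r in physician],
                    [collapse(r) for r in head_nurse])
```

In this illustration the coefficient for the collapsed scale is the lower of the two, the same direction as the drop from .64 to .49 reported above.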


Special Scales 


Table 1 presents the results of the analyses 
of variance for the Marlowe-Crowne scale, 
for the special scales, and for the validity 
scales of the MMPI. The analyses of 
variance for these 12 scales revealed four 
significant effects for levels of motivation 
(Marlowe-Crowne, Mp, Sd, L), three signifi- 
cant effects for testing conditions (Mp, Dn, 
K), and two interaction effects (Tt, Ad) 
using a .05 level of confidence. Of the four 
scales which showed no significant effect, two 
(F and Ds-r) were designed primarily to 
detect a “faking-bad” test-taking attitude, 
which was an attitude presumably not rele- 
vant to the testing conditions employed in 
this study. The F minus K index, which also 
proved to have no significant effects, has been 
reported to show little success in detecting 
“faking-good” compared to “faking-bad” in- 
structional sets (Gough, 1950; Hunt, 1948). 
The last column in Table 1 reports the 
point-biserial correlations for each scale with 
the conditions of testing, which gives a rough 
indication of the relative efficiency of these 
scales in discriminating between the two 
testing conditions. 
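
For readers unfamiliar with the point-biserial coefficient, the following is a minimal sketch of the standard formula, in which the difference between the two group means is divided by the standard deviation of all scores and multiplied by the square root of pq. The scores and the 1/0 coding of the testing conditions are hypothetical.

```python
import math


def point_biserial(scores, groups):
    """Point-biserial r between scale scores and a dichotomy.

    groups uses 1 for the defensive condition and 0 for the routine condition.
    """
    n = len(scores)
    group1 = [s for s, g in zip(scores, groups) if g == 1]
    group0 = [s for s, g in zip(scores, groups) if g == 0]
    p = len(group1) / n  # proportion of Ss in the defensive condition
    q = 1 - p
    mean_all = sum(scores) / n
    # Standard deviation of all scores, using N in the denominator.
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(group1) / len(group1) - sum(group0) / len(group0)
    return mean_diff / sd_all * math.sqrt(p * q)


# Hypothetical scale scores for three Ss in each condition.
example_scores = [14, 16, 18, 10, 12, 14]
example_groups = [1, 1, 1, 0, 0, 0]
r_pbis = point_biserial(example_scores, example_groups)
```

The value obtained is identical to the Pearson r between the scores and the 1/0 coding, so a positive coefficient means higher scores under the defensive condition.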

An inspection of the cell means reveals 
that for all 12 scales the differences in mean 
scores for the two testing conditions were in 
the predicted direction, that is, the patients 
in Condition 1 had mean scores reflecting 
more defensiveness as compared with the pa- 
tients tested under the routine condition. For 
9 of the 12 scales, the difference in means for 
the two testing conditions was greater for the 
group of Ss showing the greatest motivation 
for discharge than for either of the other 
two groups of Ss. 


TABLE 1 

ANALYSIS OF VARIANCE FOR VALIDITY 
AND SPECIAL SCALES 

Scale             Testing      Motivation   Interaction   r(pbis) 
                  conditions 
Marlowe-Crowne                 **                          .094 
So-r                                                       .214 
Mp                *            *                           .281 
Tt                                          *              .178 
Ds-r                                                      -.189 
Ad                                          *             -.220 
Dn                *                                        .282 
Sd                             **                          .133 
F                                                         -.085 
K                 *                                        .291 
L                              *                           .187 
F - K                                                     -.186 

* p < .05. 
** p < .01. 


TABLE 2 

ANALYSIS OF VARIANCE FOR CLINICAL SCALES 

Scale    Testing conditions    Motivation    Interaction 

Profile (with K) * 
Profile (without K) 

Note. Abbreviated: O = obvious key; S = subtle key. 
* p < .05. 


Clinical Scales 

Table 2 presents the results of the analyses 
of variance for the 8 clinical scales, the 10 
subtle-obvious keys, and the 2 indexes of 
degree of profile elevation. Of the eight clin- 
ical scales, Pt showed a significant effect for 
testing conditions, Hy a significant effect for 
levels of motivation, and Hy and D a sig- 
nificant interaction effect. An inspection of 
the differences in mean scores for the two 
testing conditions revealed that for the two 
profile-elevation indexes, the eight clinical 
scales, and the five obvious keys, these dif- 
ferences were all in the predicted direction, 
in that the mean scores for the defensive test 
condition were lower. For each of these scales 
the difference in means for the testing condi- 
tions was greater for the group of Ss showing 
the greatest motivation for discharge com- 
pared to either of the other groups. The 
group means for the subtle keys followed a 
different pattern. For three of the five subtle 
keys the mean for the defensive condition was 
higher than the corresponding mean for the 
routine condition, and for the depression 
subtle key this difference was significant at 
the .05 level. 


DISCUSSION 


The results of this study are consistent 
with studies by Heron (1956) and Young 
(1965) in demonstrating that situational fac- 
tors which are typical of standard assessment 
situations can influence scores on personality 
inventories. The study by Young provides a 
basis of comparison for the results of the 
present study, since he manipulated the test 
situation in an almost identical fashion; for 
example, the experimental subjects were told, 
“you must pass the test in order to leave 
the hospital.” Using individual change scores 
from initial to discharge testing, Young re- 
ports changes in the predicted direction on 
F, Hs, and Pa at the .05 level, and on D and 
Pt at the .01 level, with no significant changes 
on L, K, and the other clinical scales. The 
greater number of significant effects found by 
Young on the standard scales may be due to 
the greater precision of his study attributable 
to the use of a larger N and change scores. 
Another major difference between the two 
studies is the inclusion of female subjects in 
the Young study. The tendency towards a 
greater effect on the D and Pt scales is com- 
parable to the present study with the dif- 
ference in results for the K scale being the 
most striking discrepancy. Both studies sug- 
gest that reported subjective discomfort or 
dysphoria is most apt to be influenced by 
this situational factor. 

Since role-playing studies have been criti- 
cized for their lack of relevance to real life 
assessment situations, it is of interest to 
compare the results of the present study with 
some of the Sd studies for the standard 
scales. Wiggins (1959), who utilized two 
separate groups for the standard instructions 
and role-playing Sd instructions, obtained a 
significant decrease on Hy (obvious), Pt, and 
a significant increase on Ma for the Sd set. 
Rosen (1956), who used the same Ss for both 
standard and role-playing instructions, re- 
ports a significant decrease in scores on Hs 
and D-O and an increase in scores on D-S, 
Hy-S, Pa-S, and Ma-S under the Sd instruc- 
tions. Exner et al. (1963), who also used the 
same Ss for both instructions, report signifi- 
cant changes on L, F, K, Pd, F minus K, and 
F plus K under “fake-good” instructions. 
There seems to be little consistency from one 
study to the next as to which scales are 
affected by test-taking attitude. The lack of 
such consistency may be due in part to dif- 
ferences in Ss, in instructional sets, or in the 
use of difference scores versus comparisons of 
separate groups. These studies, however, are 
consistent on two points. First, when com- 
parisons are made between obvious and subtle 
items on the various scales, it is apparent 
that test-taking attitudes affect the responses 
to these items in different ways. In contrast 
to the tendency to reduced pathological scores 
on obvious items under defensive or social 
desirability sets, responses to subtle items 
show either no effect or an effect in the oppo- 
site direction. The second consistent finding 
is that the magnitude of the effect upon 
group means is not great; the degree of over- 
lap in scores between groups under defensive 
and routine sets makes it difficult to develop 
scales or scoring indexes which differentiate 
the two groups with any efficiency. The influ- 
ences of “fake-bad” sets, however, seem to be 
much easier to detect (Exner et al., 1963; 
Hunt, 1948). An examination of mean scores 
on such scales as K, Marlowe-Crowne, and 
the Edwards Social Desirability scale does 
not suggest that the experimental manipula- 
tion of the test situation in this study elicited 
any high degree of defensiveness as meas- 
ured by these scales. For example, the mean 
T scores on the K scale were 57 and 49 for 
Condition 1 and Condition 2, respectively. In 
general, a comparison of the mean scores on 
these special test-taking scales with those ob- 
tained in the role-playing studies indicates 
that the means obtained in this study were 
consistently below those obtained under Sd or 
“fake-good” instructions and were generally 
more comparable to scores obtained under the 
routine sets with college students. 

Of greater practical significance is the evi- 
dence for interactional effects in this study. 
The F tests and inspection of group means 
strongly suggest that the degree of defensive- 
ness aroused by the testing situation was as- 
sociated with the patient’s motivation for dis- 
charge. This interactional effect would prob- 
ably have been demonstrated even more 
clearly if Ss showing a greater dispersion on 
the attitude towards discharge variable had 
been selected. Due to the way in which Ss 
were obtained for this study, very few Ss 
were rated as even slightly negative in their 
attitudes towards discharge. The highly moti- 
vated group was rated as not simply “request- 
ing” discharge, but actually “pressing” or 
“demanding” in their attitude towards dis- 
charge. This rating suggests more than just 
high motivation for discharge; it implies a 
rather anxious or defensive attitude in which 
the patient perceives the physician as a po- 
tential barrier to his release from the hos- 
pital. Assuming that this attitude toward the 
referring physician is also directed toward 
the test situation, we may hypothesize that 
patients are likely to respond primarily with 
the defensive set only when the test situation 
is perceived as an obstacle to their immedi- 
ate goals. These results also suggest the feasi- 
bility of identifying these patients through 
behavior and information obtained apart 
from test responses. 

In examining the results for the validity 
and special scales, it is of interest to note 
which scales tend to be influenced primarily 
by the conditions of testing variable and 
which by the motivational variable. For the 
motivational or attitude towards discharge 
variable, the Marlowe-Crowne, Mp, Sd, and 
L scales were all significant. For the condi- 
tions of testing variable, the Mp, Dn, and K 
scales were significant. Several factor-analytic 
studies of these kinds of Sd scales have con- 
sistently distinguished two types of Sd scales, 
based on their factor loadings (Edwards, 
Diers, & Walker, 1962; Liberty, Lunneborg, 
& Atkinson, 1964; Wiggins, 1964). For ex- 
ample, in the factor-analytic study of a num- 
ber of response-set and response-style scales, 
Wiggins (1964) found So-r, K, Tt, and Mp 
had significant loadings on the first factor 
while Marlowe-Crowne, Sd, L, and Mp loaded 
on a factor which he termed “social desir- 
ability role playing” and which Edwards et 
al. (1962) call a “lying” factor. The congru- 
ence between these studies is remarkable, 
particularly for those scales which load on 
the role-playing factor. If we interpret this 
second group of scales according to Crowne 
and Marlowe’s (1964) concept of a need for 
approval, a reasonable explanation for the 
correlation of scores on these scales with the 
ratings for attitude towards discharge is ap- 
parent. Inspection of the row means for these 
scales indicated that it was primarily the low 
scores for the group rated as lowest in their 
motivation for discharge which produced the 
levels effect. As a check on this impression, 
the significance of the differences between 
each pair of row means was tested by t test. 
Using a .05 level of significance, the t tests 
for these four scales showed that in each case 
the lowest motivated group differed signifi- 
cantly from the other two groups, while the 
latter groups did not differ from each other. 
Since the philosophy of treatment on the Psy- 
chiatry Service is oriented toward short-term 
hospitalization and since all of the patients in 
this study had been evaluated by their physi- 
cians as ready for discharge, it is reasonable 
to assume that the low-motivated group of 
patients was perceived by the physicians as 
somewhat resistant toward the discharge rec- 
ommendation. Thus, a patient showing less 
need for approval in his interpersonal orien- 
tation would be less likely to comply with the 
prevailing patient-staff culture, which empha- 
sizes that fairly early in his hospitalization he 
should be oriented primarily toward leaving 
the hospital. 


THERAPIST RESPONSES TO HOSTILITY AND DEPENDENCY 


AS A FUNCTION OF TRAINING 1 


MARTIN J. BOHN, Jr. 


Washington University 


Therapist responses to a typical client, a hostile client, and a dependent 
client were studied as a function of the Ss’ training. Ss were 18 advanced 
graduate students in a course in theories and techniques of psychological 
counseling. The course included didactic material with supervised experience 
in the form of role playing, structured interviews, and practice counseling. 
Responses to tape recordings of these clients were obtained before and after 
the course and were scored for directiveness. Results showed Ss to be in- 
creasingly directive to the typical, hostile, and dependent clients, respectively, 
on both administrations and showed Ss to be less directive to all 3 clients on 
the 2nd administration. The decrease in directiveness to the dependent cli- 
ent, however, was not significant. Implications suggested were that different 
clients elicit different responses from the same therapist, and that training 
may affect responses to hostility more than it affects responses to dependency. 


How therapists respond to clients has been 
studied on numerous dimensions. One ap- 
proach has been to relate a therapist’s behav- 
ior to his own personality attributes, such as 
experience, training, needs, drives, or peculi- 
arities. Therapist behavior has also been stud- 
ied as a function of the client with whom he 
is working. A third approach to this topic 
has been to investigate the process of therapy, 
focusing on the interaction between the thera- 
pist and the client. This study investigated 
therapists’ responses to clients with emphasis 
on the therapist dimension of training and on 
the client dimensions of hostility and depend- 
ency. 

Client hostility has been shown to be an 
effective influence on therapist behavior. For 
example, Bandura, Lipsher, and Miller (1960) 
found therapist response to hostility to be 
related to the direction of the expressed hos- 
tility. Hostility directed toward the therapist 
did not elicit as many positive or “approach” 
responses as did hostility expressed toward 
someone else. These therapist responses were 
related to client’s own future expressions or 
absence of expressions of hostility. Russell 
and Snyder (1963) have also reported the 
finding that hostile client behavior produced 
more anxiety in counselors than did friendly 
client behavior. 


1 The data for this study were gathered while the 
author was at the University of Iowa. Grateful 
acknowledgment is made to Willis D. Poland and 
Janis H. Weiss who conducted the supervisory 
seminars and course described in this paper. 

This is a slightly revised form of a paper pre- 
sented at the Midwestern Psychological Association 
convention, Chicago, 1966. 

Another client behavior which has been 
investigated in relation to therapist responses 
is that of dependency. Similar to the results 
of Bandura et al. (1960) with hostility, Sny- 
der (1963) found that therapist responses to 
dependency were related to the direction of 
the expressed dependency. When the client 
dependency was expressed toward someone 
other than the therapist, the therapist’s level 
of reassurance was below his average. On the 
other hand, when the dependency was directed 
toward the therapist, his reassurance level 
went above his average. 

In a study which manipulated both client 
hostility and client dependency simultane- 
ously, Heller, Myers, and Klein (1963) found 
both of these variables to be effective determi- 
nants of the therapist’s responses. As pre- 
dicted, therapists were more likely to be di- 
rective and reassuring in response to depend- 
ency. In response to hostility, therapists were 
more likely to respond in a less friendly or 
avoidant manner. 

In addition to the client dimensions of 
expressed hostility and dependency, effects of 
therapist training were also of interest in this 
study. Although results of research in this 
area are not unanimous, there is support for 
the conclusion that experienced, trained 
therapists tend to be more effective and suc- 
cessful therapists than are beginners. Hope- 
fully, some of this difference is a result of 
training. A number of studies have demon- 
strated positive effects of training experience 
on counselor behavior (Carkhuff & Truax, 
1965; Demos & Zuwaylif, 1963; Grigg, 1961; 
Jones, 1963). 

The present study investigated therapist 
responses to three clients. Responses to these 
clients were obtained at two different times in 
the therapists’ training. In order to minimize 
the client variation and to assure that the 
subjects responded to the identical stimuli, 
tape recordings were used. Presenting the cli- 
ents on tape was taken as a step in the direc- 
tion of Strupp’s ideal test of therapist fac- 
tors in counseling, that is, presentation of an 
identical client to each subject under identi- 
cal circumstances (Strupp, 1960). 


METHOD 
Subjects 


The subjects in this study were 18 students (16 
male, 2 female) in a graduate psychology course in 
theories and techniques of psychological counseling. 
The course was given in the second year so that the 
students had all completed at least 1 year of gradu- 
ate work before the time of the course. 


Instruments (Tapes) 


The tapes were portions of simulated initial inter- 
views; at selected times during the tape there were 
silences and the subjects responded via a multiple- 
choice format to client remarks. The alternatives 
offered to the subjects had previously been scored 
for directiveness and response category according to 
a system based on Snyder’s (1945) categories. This 
categorization has been used with some modifica- 
tion in several studies (Aronson, 1953; Bohn, 1965; 
Frank & Sweetland, 1962; Snyder, 1963; Parker, 
1963) and has been shown to have an acceptable 
level of interjudge agreement (Fogel, 1957). 

The tapes were developed to represent three client 
types: the typical client, the hostile client, and the 
dependent client. The typical client complained of 
some difficulties in his life, but offered minimal com- 
plications in relating to the therapist. The hostile 
client directed some of his strong negative feelings 
toward the therapist, while the dependent client 
asked the therapist for support and reassurance. 
All three clients were young men of college age. 

Material for these tapes of client types came from 
varied sources. Text for the hostile client was based 
on a hostile client in the study by Heller et al. 
(1963). The sound track from a film of a dependent 
client (Strupp & Jenkins, 1963) served as the de- 
pendent client in this study. The material for the 
typical client was adapted from a case taken from 
the files of a university counseling service. 


The Course 


The course in which these students were enrolled 
included didactic course work and supervised prac- 
tice. The didactic material covered theories (Roger- 
ian, psychoanalytic, and directive) and techniques. 
This material was integrated with practice in three 
sets of interviews. The initial interview dealt with 
the definition of the problem; the second interview 
was devoted to goal setting. Two more interviews 
were held in this first set, building on the problem 
definition and goals established. In the second set 
of four interviews, the students were not given spe- 
cific instructions, except that they were to develop a 
problem and to talk about it. In these two sets of 
interviews the students in the course participated in 
eight interviews as a therapist and eight interviews 
as a client. For the third set of interviews, the stu- 
dents were involved in three “pseudocounseling” en- 
counters with undergraduate girls who were trained 
to act as clients. These girls reputedly had been 
chronically delinquent in regard to some university 
curfew rules and, in lieu of punishment, had agreed 
to participate in a series of diagnostic interviews. 
The “therapist” was to use these interviews diag- 
nostically to determine the personality structure of 
the acting-out girls and the appropriate treatment. 
After the course was finished, the students were in- 
formed that the “clients” were actresses. As far as 
could be determined, none of the subjects was aware 
that these girls were not genuine clients until after 
the final interview. 

Students tape recorded their interviews and were 
given individual supervision by more advanced 
graduate students who were in their final year of 
training. These more advanced graduate students, in 
turn, tape recorded the supervising sessions and par- 
ticipated in a weekly seminar on counseling super- 
vision. 


Procedure 


Early in the semester the tapes of the three clients 
(typical, hostile, and dependent) were played to the 
members of the class. At the end of the semester 
the tapes of the three clients were again presented 
to the students. The subjects’ responses were ana- 
lyzed for directiveness with comparisons being made 
among the three clients and between the first and 
second administrations. 


RESULTS 


In Table 1 the directiveness-score means 
and standard deviations for the three clients 
in both administrations are presented. Overall, 
the subjects responded with increasing 
directiveness to the typical, hostile, and dependent 
clients, respectively. This trend occurred 
in both the first and second administrations, 
being more pronounced after the subjects 
had taken the course. A second general 
finding reflected in these data was that the 
subjects were less directive with all three cli- 
ents in their responses at the time of the sec- 
ond administration. 

The results of the two-factor analysis of 
variance are shown in Table 2. The Client × 
Administration interaction was not significant, 
permitting a test of the main effects of client 
and administration. Both of these main ef- 
fects were significant, suggesting that the dif- 
ferences related to these factors were mean- 
ingful; that is, it seems that the subjects did 
respond differentially to the three clients, and 
differentially on the two administrations. 
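
As a rough arithmetic check on the analysis of variance summarized in Table 2, each F ratio should equal the mean square for that source divided by the within-cells mean square, and the 102 within-cells degrees of freedom correspond to six cells of 18 scores each. The short Python sketch below recomputes these quantities from the tabled values. The names `f_ratio`, `MEAN_SQUARES`, and `F_RATIOS` are illustrative only and do not come from the original report; because the tabled mean squares are rounded, the recomputed ratios may differ from the published ones in the second decimal place.

```python
# Mean squares as printed in Table 2 (rounded values from the report).
MS_WITHIN = 2.40
MEAN_SQUARES = {
    "Client": 14.52,
    "Administration": 43.05,
    "Client x Administration": 2.02,
}

# Design constants taken from the Method section: 3 tapes, 2 administrations,
# 18 subjects per cell.
N_CLIENTS = 3
N_ADMINISTRATIONS = 2
N_PER_CELL = 18

# Within-cells df when the six cells are treated as separate groups.
DF_WITHIN = N_CLIENTS * N_ADMINISTRATIONS * (N_PER_CELL - 1)


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F ratio: effect mean square over error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


F_RATIOS = {source: f_ratio(ms, MS_WITHIN) for source, ms in MEAN_SQUARES.items()}

if __name__ == "__main__":
    print(f"within-cells df = {DF_WITHIN}")
    for source, f in F_RATIOS.items():
        print(f"{source:<24} F = {f:.2f}")
```

The recomputed ratios agree with the tabled F values to within rounding error.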

Results of t tests of specific mean differences 
suggest that the differences between 
the typical client and the dependent client 
were significant on both administrations (t = 
2.63, p < .05; t = 4.07, p < .01). The differences 
between the typical client and the hostile 
client were significant only on the second 
administration (t = 2.53, p < .05). Finally, 
the differences between the first and second 
administrations were significant for the typical 
and hostile clients (t = 3.86, p < .01; t 
= 2.29, p < .05) but not for the dependent 
client. 


DISCUSSION 


This study investigated therapist responses 
to client hostility and dependency at two 
levels of training. The results reaffirm a fact 


TABLE 1 

DIRECTIVENESS-SCORE MEANS AND STANDARD DEVIATIONS 
CLASSIFIED BY CLIENT TYPE AND ADMINISTRATION 

                                Client type 
Administration            Typical   Hostile   Dependent 
First (N = 18)     M       2.94      4.00       4.11 
                   SD      1.21      2.17       1.45 
Second (N = 18)    M       1.72      2.61       3.61 
                   SD      0.57      1.38       1.88 


TABLE 2 

SUMMARY OF ANALYSIS OF VARIANCE FOR DIRECTIVENESS 
SCORES CLASSIFIED BY CLIENT TYPE AND ADMINISTRATION 

Source of variation          df      MS        F 
Client                        2     14.52     6.04* 
Administration                1     43.05    17.93* 
Client × Administration       2      2.02      .84 
Within                      102      2.40 

* p < .01. 


which is well known about therapy but some- 
times disregarded: Individual therapists re- 
spond differently to different clients. This was 
more clearly evident on the second adminis- 
tration than on the first and is consistent with 
the belief that therapists with more training 
and experience can demonstrate more flexi- 
bility and variation than can novices. On the 
first administration, the typical client was 
responded to in a manner distinctly different 
from the hostile client or the dependent cli- 
ent. The hostile and dependent clients were 
responded to with approximately the same 
level of directiveness. On the second adminis- 
tration, no two clients elicited particularly 
similar levels of directiveness. 

In general, the subjects were less directive 
to all clients on the second administration. 
This finding is similar to earlier results using 
these tapes in which more experienced sub- 
jects were less directive (Bohn, 1965). There 
is also some evidence in the literature sug- 
gesting that more experienced individuals are 
seen as less probing, advice-giving, or direc- 
tive (Demos & Zuwaylif, 1963; Grigg, 1961; 
Jones, 1963). 

In this study, the subjects were generally 
less directive after taking the course and par- 
ticipating in the interview experience. This 
decrease in directiveness, however, was selec- 
tive. On the one hand, responses to the typi- 
cal and hostile clients were notably less di- 
rective, as would have been expected with 
these tapes; previous findings have shown ex- 
perienced counselors to be less directive in 
response to these clients than to the depend- 
ent client (Bohn, 1965). On the other hand, 
responses to the dependent client indicated 
no such significant decrease in directiveness. 
On both administrations, the subjects responded 
to the dependent client at a comparable 
level of directiveness. It may be that it 
was easier for the subjects to learn to cope 
with hostility than with dependency. After 
training the subjects did not react so obvi- 
ously to the attacks of the hostile client. 

Generalizations from these results are lim- 
ited by a number of considerations. The 
range of experience change in these subjects 
is narrow; it is recognized that one semester of 
training does not represent the amount of ex- 
perience necessary for an individual to pro- 
gress from the status of a beginner to that of 
an expert. If the time span and amount of 
training between these two administrations 
were to be increased, one might expect more 
overall change in the therapists’ performances. 
Another limitation is the fact that the 
therapists’ responses were obtained in an 
analogue situation rather than an actual 
counseling interview. The final test of these 
data will come in replication of similar find- 
ings in bona fide therapy situations. 

A major implication of these findings is 
that different clients elicit different responses 
from the same therapist. For research and 
for practice this underscores the fact that 
therapy or counseling is not a uniform treat- 
ment offered in the same form to every client. 
An individual therapist’s performance is, to 
a certain extent, a function of the client’s be- 
havior. These data go beyond Kiesler’s (1966) 
“therapist uniformity myth” which states that 
therapy is a uniform treatment condition ap- 
plied in the same manner by different thera- 
pists. It appears that within the same thera- 
pist there are significant differences in thera- 
peutic approach in response to different cli- 
ents. 

Finally, dependency and hostility have 
been shown again to be two aspects of a cli- 
ent’s behavior which can produce demonstra- 
ble effects on the therapist’s performance. 
Training does not affect therapist responses 
to these behaviors in necessarily the same 
way. It is possible that didactic and experi- 
ential training as currently used is more ef- 
fective in changing therapists’ responses to 
hostility than in changing therapists’ re- 
sponses to dependency. 


Martin J. Bohn, Jr. 
PARENTAL MALEVOLENCE AND CHILDREN’S 
INTELLIGENCE 1 


JOHN R. HURLEY 
Michigan State University 


Independent interview responses of mothers and fathers to 4 indexes of 
malevolent behavior were correlated with CTMM IQ scores of 451 3rd 
graders. The results supported the following conclusions: (a) parental malevolence 
and children’s IQs are negatively correlated; (b) mother-daughter pairs 
show the strongest association between parental malevolence and child’s IQ; 
(c) parental education and socioeconomic status account for little of the 
covariance between parental malevolence and child’s IQ; (d) the linkage 
between child’s IQ and parental malevolence is more apparent among parents 
having less than a high school education than it is among parents who have 
attended college. The interpretation that parental malevolence is causally 
related to children’s intelligence appears more defensible than alternative 
interpretations. 


It seems widely accepted that rejecting or 
hostile parental behaviors adversely influence 
children’s adjustment, even though some of the 
most persuasive evidence is of rather recent 
vintage (Bandura & Walters, 1959; Winder 
& Rau, 1962). Little is yet known, however, 
of the effect of such parental behaviors upon 
children’s intellectual development. Suggest- 
ive evidence that children’s intelligence may 
be adversely affected by parental hostility is 
found in publications by Baldwin, Kalhorn, 
and Breese (1945), Hurley (1959), Digman 
(1963), and Bayley and Schaefer (1964). 
Kagan and Freeman (1963) and Kagan 
(1964) have reported inverse relationships be- 
tween children’s IQs and maternal restrictive- 
ness, severity, coercion, and criticism. The 
case for a negative relationship between in- 
dexes of parental rejection or punitiveness 
with children’s IQ scores seems most clearly 
supported by a precursor of the present study 
(Hurley, 1965). 

This latter report was limited to a sample 
of 143 families of third-grade children who 
voluntarily returned a mail-out questionnaire 


1 The data were obtained through the cooperation 
of the Mental Health Research Center, Rip Van 
Winkle Foundation, Hudson, N. Y. This project 
was supported in part by Grant M1726 from the 
United States Public Health Service, Leonard D. 
Eron, principal investigator. Grateful acknowledg- 
ment for their essential services is extended to the 
Rip Van Winkle Foundation research staff and to all 
those school officials, parents, and children of Co- 
lumbia County who contributed to this study. 


concerned with child-rearing attitudes. Among 
the provocative results of that study were 
the following: (a) evidence of a negative 
correlation between children’s IQs and three 
different indexes of parental rejection; (b) 
indications that this linkage was stronger be- 
tween mother-daughter pairs than between 
other parent-child combinations; (c) although 
only a minor portion of the covariance be- 
tween child’s intelligence and parental rejec- 
tion could be accounted for by parental dif- 
ferences in educational and socioeconomic 
status (SES), trends suggested a particularly 
marked association between parental rejection 
and child’s IQ among the few families repre- 
senting distinctly low levels of parent educa- 
tion and SES. Designed to marshall more 
definitive evidence on such issues, this study 
is essentially an extension of the prior investi- 
gation. It differs in employing a broader 
range of parent-behavior measures and uti- 
lizes a more representative sample. 

The terms rejection, hostility, and puni- 
tiveness have been used somewhat inter- 
changeably in the preceding discussion of the 
central parental variable. This indefinite ter- 
minology arises from the use of differentially 
labeled measures in the relevant earlier stud- 
ies. It seems probable, however, that these 
measures have a substantial common core in 
the love-hate or acceptance-rejection dimen- 
sion, which is apparently the most prominent 
and stable molar parental behavior variable 
relevant to the parent-child relationship 




TABLE 1 

PRODUCT-MOMENT CORRELATIONS AMONG PARENT-INTERVIEW VARIABLES 

         Pun       Agg       Rej       JP        SES       EL 
Pun     (.35**)    .42**     .19**     .17**    -.15*     -.32** 
Agg      .28**    (.25**)    .26**     .10*     -.18**    -.44** 
Rej      .10*      .14*     (.42**)    .09*     -.03      -.05 
JP       .00       .16**     .03      (.13*)    -.07      -.14* 
SES     -.17**    -.27**    -.01      -.14*     (a)        .30** 
EL      -.27**    -.38**    -.01      -.12*      .48**    (.50**) 

Note.—All Ns = 451; mothers’ data above diagonal, fathers’ below. 
a Determined only from father’s occupational status. 
* p < .05, two-tailed test. 
** p < .001, two-tailed test. 


(Baldwin, Kalhorn, & Breese, 1945; Milton, 
1958; Schaefer, 1959; Zuckerman, Ribback, 
Monashkin, & Norton, 1958). In addition, 
this dimension has been independently estab- 
lished as a major component of adult behav- 
ior (Adams, 1964; Leary, 1957). Parental 
malevolence was selected as the most appro- 
priate caption because this label readily sub- 
sumes the subvarieties of more specific be- 
haviors represented by the several concep- 
tually related measures presently employed. 


METHOD 


This sample consisted of 206 girls and 245 boys 
and their mothers and fathers. These children constituted 
about 55% of all third-graders enrolled in the 
public and private schools of a rural New York 
county in 1960. As Toigo (1965) has described, this 
was a rather typical United States rural county and 
the sample included all families where both parents 
participated in voluntary home interviews and whose 
children had completed several classroom measures. 

All of the parent behavior indexes used were taken 
from an objective, 286-item, precoded interview 
schedule which was administered separately to each 
parent. Details of the sample and interviewing pro- 
cedures have been fully described by Toigo (1965). 
Two of these measures were employed in Hurley’s 
(1965) preliminary study: the Punishment (Pun) 
scale and the Judgment of Punishment (JP) index. 
The 24 Pun scale items deal with the parents’ stated 
probable responses to two types of aggressive behav- 
ior by their child, aggression directed toward a 
parent and aggression directed toward another child. 
Eron, Walder, Toigo, and Lefkowitz (1963, p. 852) 
have fully described the Pun items. The JP index 
reflected the parents’ judgment of the severity of 13 
common punishments for their child along an 8- 
point continuum ranging from “very harsh” to 
“very mild.” A third parental measure was the 
Aggression (Agg) scale developed by Walters and 
Zak (1959). It consists of 12 items dealing with 
the respondent’s inclination to react with irritation 
or hostility to the described behavior of self or 
other adults. The fourth parental measure was la- 
beled the Rejection (Rej) scale and was constituted 
by 12 items affording the respondents an oppor- 
tunity to express complaints about their own child. 

Children’s intelligence was assessed in the third- 
grade classrooms using the California Test of Men- 
tal Maturity (CTMM), 1957 S-form. This CTMM 
version has an internal consistency coefficient of .75 
(California Test Bureau, 1957) and also correlates 
about .75 with both the Wechsler Intelligence Scale 
for Children and the Stanford-Binet Intelligence 
Scale. 

Parental education and SES indexes were ob- 
tained from the parent interview. Educational level 
(EL) was categorized on a 7-point scale: (a) gradu- 
ate or professional training, (b) college graduate, 
(c) 1-3 years of college, (d) high school graduate, 
(e) 10-11 years of education, (f) 7-9 years of educa- 
tion, and (g) under 7 years of education. SES was 
classified according to United States Bureau of 
Census (1960) method, utilizing 10 categories of 
father’s occupation. Researchers (Kohl & Davis, 
1955; Lawson & Boek, 1960) have demonstrated that 
this technique yields an index of social class which 
is as meaningful as any combination of other factors. 


RESULTS 


Table 1 contains product-moment correla- 
tions among the six parent-interview variables 


TABLE 2 

PRODUCT-MOMENT CORRELATIONS BETWEEN CHILD’S IQ AND PARENTAL MALEVOLENCE INDEXES 

             Sons (N = 245)              Daughters (N = 206)          All children (N = 451) 
Parent    Pun     Agg    Rej    JP     Pun     Agg     Rej    JP     Pun     Agg     Rej     JP 
Father   -.24**  -.17*  -.02   -.03   -.28**  -.19*   -.14*  -.06   -.25**  -.18**  -.08    -.05 
Mother   -.24**  -.20*  -.11   -.00   -.31**  -.35**  -.16*  -.11   -.27**  -.27**  -.14**  -.05 
Both     -.24**  -.18*  -.07   -.01   -.28**  -.27**  -.15*  -.09   -.26**  -.22**  -.11*   -.05 

* p < .05, two-tailed test. 
** p < .001, two-tailed test. 


TABLE 3 

PARENT PUN AND AGG SCORE MEANS AND PRODUCT-MOMENT CORRELATIONS AT 
THREE LEVELS OF PARENT EDUCATION 

                                          Mothers                          Fathers 
                                      Means     Correlations           Means     Correlations 
Educational level              N    Pun   Agg    Pun     Agg     N    Pun   Agg    Pun     Agg 
1 year college or more 
  (Child’s mean IQ = 109.2)   113   9.0  14.6   -.04    -.01    100   8.3  14.4   -.01    -.05 
High school graduates 
  (Child’s mean IQ = 105.7)   173   9.7  15.4   -.08    -.24*   139   9.5  15.2   -.10    -.16* 
11 years or less 
  (Child’s mean IQ = 99.4)    165  12.1  16.1   -.19*   -.23*   212  10.8  16.2   -.31**  -.09 

* p < .05, two-tailed test. 
** p < .001, two-tailed test. 


with the mother versus father correlations 
bracketed along the diagonal. 

Table 2 gives product-moment correlations 
between child’s CTMM IQ and parents’ 
scores on the four malevolence indexes. Table 
3 presents correlations between child’s IQ and 
parents’ scores on the Agg and Pun measures 
at three different levels of parental education 
and means on the Agg and Pun indexes. 


DISCUSSION 
Parental Interview Measures 


Supporting the malevolence designation of 
the Agg, Pun, Rej, and JP indexes was the 
statistical significance of 10 of the 12 inter- 
correlations among these variables noted in 
Table 1. These correlations also identify the 
Pun and Agg indexes as nearly tied for great- 
est communality within the malevolence vari- 
able among mothers, while among fathers the 
Agg index has greatest communality. With 
both parents, JP correlated least with the 
remaining three variables. Plainly, the more 
projective JP score was the least satisfactory 
of these four measures. Perhaps the Pun in- 
dex is the most generally adequate and mean- 
ingful of these malevolence measures in that 
it possesses greatest internal consistency, 
probably due to its containing 60-100% 
more items than the other, rather short 
scales, and relates nearly as well to Rej and 
JP as does Agg. Pun also reflects greater 
mother-father consensus than Agg and correlates 
with SES and EL to a lesser degree 
than does Agg. Because of their generally 
greater communality with the other malevo- 
lence indexes, Pun and Agg were selected over 
the Rej and JP measures to more fully explore 
the relationships between parental malevo- 
lence and child’s IQ among parents differing 
in EL. The intercorrelations among mothers’ 
scores on the malevolence items tended to be 
higher than those of their husbands except in 
the case of JP versus Agg. 

From the Table 1 data, it is also clear that 
parental EL relates more highly to the malev- 
olence indexes than does SES despite the sub- 
stantial EL-SES correlation. SES relates so 
modestly to the four malevolence indexes 
that little of the latter’s covariance with 
child’s IQ, as described in Table 2, can be 
attributed to SES. However, all correlations 
between these malevolence indexes and EL or 
SES are negative. Plainly, lower levels of 
parent education and SES are associated with 
a more undesirable parental orientation than 
accompanies higher parent EL or SES, what- 
ever the sources of these associations may be. 


Child’s IQ and Parental Malevolence 


The exclusively negative correlations among 
all four parental indexes and children’s IQ 
confirm the preliminary findings (Hurley, 
1965). Most of the statistically significant 
associations are between child’s IQ and the 
Pun and Agg variables. Parent-daughter correlations 
exceed parent-son correlations in all 
eight relevant comparisons, although none of 
these individual differences is statistically 
reliable. In the perspective of six out of a 
possible six similar differences in Hurley’s 
(1965) prior study, it seems established that 
a daughter’s IQ is more closely linked with 
parental malevolence than is a son’s IQ. 
While the mother-child correlations in Table 
2 tend to be slightly higher than the cor- 
responding father-child relationships, these 
differences are not significant. 

Mothers’ behavior tends to correlate most 
highly with daughters’ IQ, as all 12 possible 
comparisons between mother-daughter pairs 
versus other parent-child pairings were in the 
same direction. Again confirming the earlier 
report (Hurley, 1965), this result sharply 
conflicts with Schaefer and Bayley’s (1963) 
report of “more significant correlations of 
maternal behavior with sons’ behavior than 
with daughters’ behavior [p. 95].” Since the 
Kagan and Freeman (1963) data strongly 
support the present findings, it seems likely 
that this discrepancy may be attributed to 
the small size of the Schaefer and Bayley 
samples (Ns about 24). 

It seems reasonable that the major linkage 
between parental malevolence and child’s IQ 
should occur between mothers and daughters 
of this age. The expectation of a stronger 
bond between mother-daughter pairs than 
between other parent-child combinations is 
based upon both the phenomenon of identi- 
fication with same-sex model and the fact 
that in our culture fathers typically spend 
much less time in direct contact with children 
than do mothers. As the observed differences 
between these mother-child and 
father-child correlations are small, this dis- 
tinction appears of greater theoretical signifi- 
cance than it is of practical importance. The 
salient point is that malevolent behaviors of 
both mothers and fathers have a significant 
negative association with the child’s IQ. 

Both the reliability restrictions and sam- 
pling limitations probably attenuated the 
magnitude of the Table 2 correlations. The 
internal consistency of the CTMM IQ score 
used is given as .75 (California Test Bureau, 
1957). The odd versus even item correlations 
of both the Pun and Agg indexes, as determined 
from a randomly selected set of 
160 mothers from the present sample and 
corrected for length by the Spearman-Brown 
formula, were .73 and .63, respectively. Cor- 
recting the observed correlations between 
mothers’ scores and daughters’ IQs for this 
unreliability shifts the Table 2 Pun r from 
—.31 to —.42 and the Agg r from —.35 to 
—.51. Thus, the present data suggest that at 
least 17-26% of the IQ differences among 
these girls is theoretically associated with 
variations in maternal malevolence. This 
estimate is only slightly below the 30% 
value noted earlier (Hurley, 1965), using a 
mail-out questionnaire index of Manifest Re- 
jection which correlated .46 (N = 194) with 
the Pun score. Even these substantial associ- 
ations are probably underestimates of the 
true size of this linkage due to the under- 
representation of emotionally disturbed and 
extremely low SES families in these samples. 
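
The correction described in this paragraph can be retraced with the standard formulas: the Spearman-Brown step-up for an odd-even correlation, and the correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. The Python sketch below is offered only as an illustration under those assumptions. The function and variable names are hypothetical, and the half-length correlations themselves are not reported in the text, so `spearman_brown` is shown for completeness rather than applied to the study's data.

```python
import math


def spearman_brown(r_half: float) -> float:
    """Step a split-half (odd-even) correlation up to full test length."""
    return 2 * r_half / (1 + r_half)


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Estimate the correlation between true scores from an observed r."""
    if rel_x <= 0 or rel_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(rel_x * rel_y)


# Values as reported in the text and in Table 2.
CTMM_RELIABILITY = 0.75                    # internal consistency of the IQ score
RELIABILITY = {"Pun": 0.73, "Agg": 0.63}   # Spearman-Brown corrected odd-even values
OBSERVED_R = {"Pun": -0.31, "Agg": -0.35}  # mothers' scores versus daughters' IQs

CORRECTED = {
    name: correct_for_attenuation(OBSERVED_R[name], RELIABILITY[name], CTMM_RELIABILITY)
    for name in OBSERVED_R
}

if __name__ == "__main__":
    for name, r in CORRECTED.items():
        print(f"{name}: corrected r = {r:.2f}, shared variance = {r * r:.3f}")
```

Under these assumptions the corrected values round to -.42 and -.51, and their squares fall close to the 17-26% range cited above.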


Child’s IQ versus Parental Malevolence at 
Different Levels of Parent Education 


Table 3 reveals substantial differences in 
the Pun and Agg scores of parents differing 
in EL, with the less educated manifesting the 
stronger inclination toward malevolent be- 
havior. As anticipated, the children’s IQ 
scores are importantly related to EL. The 
product-moment correlation between the IQs 
of all of these children with fathers’ EL was 
r = .29; for mothers, r = .30. Despite these 
relationships, however, all correlations be- 
tween parent malevolence indexes and child’s 
IQ within the different levels of parent edu- 
cation remained negative. The degree of link- 
age between child IQ and parent malevolence 
varies importantly with parental EL, and this 
association is substantially higher among 
families where parents do not have any col- 
lege education than it is among those families 
where the parents have attended college. 
Appraising the differences of the Table 3 correlations 
using Fisher’s r to z transformation, 
only the father’s Pun versus child’s IQ 
correlations differed significantly at the .05 
level. In these instances, the correlation of 
the least educated fathers (r = —.31) reli- 
ably exceeded the corresponding correlations 
for both the high-school- and college-educated 
fathers. The difference between the low correlation 
of college-educated mothers’ Agg 
scores versus their child’s IQ and the larger 
similar correlations for both groups of less 
educated mothers fell just short of the .05 
level. While only 2 of the 12 possible dif- 
ferences in Table 3 correlations reached sta- 
tistical significance, the pattern of greater 
linkage between parental malevolence and 
child’s IQ among the less educated and 
smaller associations among the better edu- 
cated is clear. Only among the 24% of these 
parents who had attended college is this 
relationship of apparently trivial size. 
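
The comparisons of Table 3 correlations described in this section follow the usual test for two independent correlations: each r is converted to Fisher's z, and the difference is divided by the square root of 1/(N1 - 3) + 1/(N2 - 3). The Python sketch below illustrates the computation using the rounded values printed in Table 3. The function names are hypothetical, and because the inputs are rounded the resulting statistics are approximate rather than the author's own figures.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's r to z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def compare_independent_correlations(r1: float, n1: int, r2: float, n2: int):
    """Return (z statistic, two-tailed p) for the difference of two
    correlations drawn from independent samples."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return z, p


# (r, N) pairs copied from Table 3.
FATHERS_PUN = {
    "11 years or less": (-0.31, 212),
    "High school graduates": (-0.10, 139),
    "1 year college or more": (-0.01, 100),
}
MOTHERS_AGG = {
    "High school graduates": (-0.24, 173),
    "1 year college or more": (-0.01, 113),
}

if __name__ == "__main__":
    low = FATHERS_PUN["11 years or less"]
    for level in ("High school graduates", "1 year college or more"):
        z, p = compare_independent_correlations(*low, *FATHERS_PUN[level])
        print(f"Fathers' Pun, least educated vs {level}: z = {z:.2f}, p = {p:.3f}")
    z, p = compare_independent_correlations(
        *MOTHERS_AGG["1 year college or more"], *MOTHERS_AGG["High school graduates"]
    )
    print(f"Mothers' Agg, college vs high school: z = {z:.2f}, p = {p:.3f}")
```

With the tabled values, both comparisons involving the least educated fathers exceed the two-tailed .05 criterion, while the mothers' Agg comparison falls slightly short of it, in line with the pattern reported above.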

Probably the college-educated parents’ 
greater sophistication about the socially ac- 
ceptable responses to questions dealing with 
child-rearing practices contributed to this 
lower association. These parents would seem 
more likely than the less educated to conceal 
their employment of malevolent or coercive 
patterns of dealing with children regardless 
of their actual behavior. Thus, even if the 
same relationships between parental malevo- 
lence and children’s IQs existed in this better 
educated group, its identification would prob- 
ably require different approaches than the 
relatively primitive measuring techniques 
used in this study. Kagan’s (1964) finding 
of substantial negative correlations between 
maternal criticism and daughter’s IQ, when 
parental education was controlled in a sample 
where 70% of the parents attended college, 
suggests that direct observations of parental 
behavior might yield a less benign impression 
of the better educated parents. 


Does Parental Malevolence Cause Lower IQ 
in Children? 


The pattern of present and prior findings 
demonstrates a method-free (mail-out questionnaire, 
parent interview, and ratings of 
observed behavior) relationship between a 
series of conceptually related measures of 
parental malevolence, punitiveness, or rejec- 
tion—whichever may ultimately prove to be 
the most appropriate label—and third-grade 
children’s IQ scores. Since correlations can 
only establish associations and afford no direct 
basis for causal statements, a problem 
is posed for the interpretation of these results. 
One interpretation of these findings 
might be that less intelligent children elicit 
more malevolent behaviors from parents than 
do more intelligent children. Although con- 
gruent with the general negative relationship 
between malevolence and IQ, this position 
offers no satisfactory explanation for another 
central finding, namely, the stronger bond 
between daughter’s IQ and mother’s malevo- 
lence than between any other parent-child 
pair. The smaller IQ versus malevolence 
correlations, which were more often found 
among the more educated families than 
among the less educated families, also seem 
inconsistent with this viewpoint. It would 
appear likely that the more ambitious upper 
EL parents would experience greater frustra- 
tion and manifest greater malevolence toward 
slow learning children than would less ambitious, 
less educated parents. 

The possibility that the present results 
may be attributed to the hidden operation 
of other, as yet unidentified, variables seems 
weak. The salience of the love-hate or 
acceptance-rejection dimension in many inde- 
pendent studies of the parent-child relation- 
ship, which appears so directly linked with 
the parental behaviors measured in this re- 
search, suggests that the present findings are 
to be expected. The recent report by Bayley 
and Schaefer (1964) of correlations between 
children’s IQs and antecedent measures of 
maternal behavior also identified the love- 
hate variable as the most effective predictor 
of boys’ IQs in their investigation. 

These objections to the principal alterna- 
tive interpretations suggest that the view of 
parental malevolence or rejection as an im- 
portant cause of low intelligence in children 
deserves careful attention. In 1945 Baldwin, 
Kalhorn, and Breese linked gains in children’s 
IQs with maternal acceptance or loving be- 
haviors, while maternal rejection or hostility 
was associated with IQ decrements. A more 
recent Fels Institute report (Kagan & 
Freeman, 1963), as corrected by Kagan 
(1964), found that maternal restrictiveness, 
coerciveness, and criticism, determined from 
ratings of observed mother’s behavior when 
these children were between 2 and 7 years 
old, correlated negatively with the same chil- 
dren’s Stanford-Binet IQs at age 9 years. 
At least for mothers and sons, the Bayley 
and Schaefer (1964) findings clearly support 
both these earlier and the present findings. 
Thus, the interpretation that malevolent 
parental behaviors are causally related to low 
IQ in children seems more defensible. 


NOTES AND COMMENTS 


USE OF LEARNING CUES WITH THE BENDER VISUAL MOTOR 
GESTALT TEST IN SCREENING CHILDREN FOR NEUROLOGICAL 
IMPAIRMENT 


DONALD C. SMITH 
Ohio State University 

AND 

ROBERT A. MARTIN 
Lancaster City Schools, Lancaster, Ohio 


A study of the comparative ability of neurologically impaired and 
nonneurologically impaired children to utilize learning cues to correct rotations of 
designs on the Bender-Gestalt Test. 25 Ss in each group, matched for age and 
IQ, were administered the Bender Visual Motor Gestalt Test (BVMGT) 
individually. Ss who rotated a design were given a series of learning cues 
designed to correct the rotation. Neurologically impaired Ss made a significantly 
greater number of rotations and required a significantly greater number 
of cues to correct rotations. Results tended to confirm the clinical impression 
that ability to correct rotations is a more discriminating index of neurological 
impairment than frequency of rotations. Problems in sampling and in definition 
of the construct “neurological impairment” restrict generality of the 
findings. 


Graphic reproduction of visual stimuli has 
become a popular procedure in screening for 
neurological impairment. With the Bender Visual 
Motor Gestalt Test (BVMGT), signs such as 
rotation, perseveration, and distortion are re- 
ported to be suggestive of encephalopathy 
(Bender, 1938; Griffith & Taylor, 1960; Hain, 
1964). Rotations of BVMGT figures, however, 
occur not only among neurologically impaired 
children, but among children who are emotionally 
disturbed, momentarily distracted, or untrained 
in spatial orientation (Byrd, 1956; Fabian, 1945; 
Koppitz, 1964). Thus, the diagnostic significance 
of rotations by the individual child is equivocal. 

In an evaluation of an article by Keller (1955), 
Ingram (1955) observed that it was often pos- 
sible in clinical practice to differentiate between 
the rotations of brain-injured and “neurotic” chil- 
dren by giving a series of learning aids which 
are increasingly more concrete in nature. Ingram 
employed three learning cues: (a) routine re- 
drawing of the design, following rotation; (b) 
redrawing the design in a rotated position, then 
in the correct position; and (c) pointing to the 
top and the bottom of the design and redrawing. 
Despite these cues, encephalopathic children perseverated 
their first response; some were unable 
to perceive the discrepancy, while others were 
aware of their error but unable to readjust. Emotionally 
disturbed children, on the other hand, 
perceived and corrected their rotational errors 
easily. 

One might expect that correction of rotation 
by the simple redrawing of a design would identify 
children who rotate because of carelessness, 
temporary distraction, or inexperience with draw- 
ing. The second cue used by Ingram provides 
additional practice in discriminating between the 
rotated and proper orientation of the figure, and 
the third mediates the response with the verbali- 
zation of the positional concepts, “top” and 
“bottom.” Movement and its perception is a 
necessary prerequisite to visual perception, and 
Goldstein and Scheerer (1941) introduced a series 
of learning cues for various tests which were 
based on increasing concreteness in reference to 
body sensations. Hence, further learning cues for 
correcting BVMGT rotations might permit the 
subject to trace or otherwise experience a design 
kinesthetically. 

Based on this rationale, a series of five cues 
was presented to an experimental group of neuro- 
logically impaired subjects and a control group 
of children with behavior or learning problems, 
but no evidence of neurological impairment. It 
was predicted that the experimental group would 
produce a significantly greater number of rota- 
tions and require a significantly greater number 
of cues to correct their rotations. In addition, 
data were analyzed in regard to the following 
questions: Which figures on the BVMGT are 
rotated most frequently by each group? Which 
type of cue provides the greatest number of 
corrections for rotations? 


PROCEDURES 


The experimental group consisted of 25 students 
from the Lancaster, Ohio, public schools diagnosed 
as encephalopathic by four practicing neurologists. 
A summary of the clinical findings is presented in 
Table 1. Positive electroencephalograms (EEG) plus 
routine neurological examination were the basis for 
diagnosis in 17 of the cases. Eight other children 
were identified as neurologically impaired on the 
basis of medical symptoms associated with a particular 
clinical syndrome, or a combination of behavioral 
signs and historical data. 


TABLE 1 
SUMMARY OF CLINICAL DATA ON NEUROLOGICALLY IMPAIRED CHILDREN 

                                                 EEG           No EEG 
Clinical data                                administered   administered    Total 
                                              (N = 17)       (N = 8)       (N = 25) 

Clinical syndrome 
  Epileptic                                       6              2             8 
  Generalized encephalopathy                      4              0             4 
  Arrested hydrocephalic                          1              2             3 
  Chronic brain syndrome (hyperkinetic type)      2              0             2 
  Cerebral palsy                                  0              2             2 
Behavioral signs 
  Hyperactivity                                  11              3            14 
  Poor attention span (distractibility)           8              1             9 
  Poor motor coordination                         5              3             8 
  Learning disability                             3              1             4 
  Impulsivity                                     3              2             5 
  Speech, language disability                     3              1             4 
  Perseveration in language and written work      1              0             1 
Case history 
  Brain injury at birth                           2              1             3 
  Seizures during infancy                         0              1             1 
  Spinal meningitis                               0              1             1 
  Asphyxiation and loss of consciousness          0              1             1 

Note. Data include only that which was specifically mentioned in the clinical reports from neurologists. 
Omission of significant data is possible, since they were not asked to report information systematically. 

The control group consisted of 25 students from 
the Lancaster, Ohio, public schools who had been 
evaluated by the school psychologist during a pre- 
ceding 2-year period. The children were referred to 
the school psychologist for a number of reasons: slow 
learners (four), emotional problems (three), slow 
learners with emotional problems (two), reading 
problems with average ability (three), academic 
underachievers of normal ability (three), academic 
underachievers of dull normal ability (three), early 
entrance into first grade (three), and social problems 
(one). 

Ideally, each of the control subjects should have 
been examined by a neurologist, but this was impossible 
due to temporal and financial considerations. 
Therefore, the absence of neurological impairment 
was inferred by the absence of the following be- 
havioral or developmental signs usually associated 
with neurological impairment: (a) continual and 
uncontrollable hyperactivity; (b) explosive behav- 
ior; (c) lack of ability to concentrate on any topic 
for more than a very short time; and (d) medical 
history of birth injury, high fevers, anoxia, or 
severe head injury. All subjects in the control group 
were regular patients of pediatricians or other quali- 
fied medical practitioners, and none had ever been 
referred for a neurological examination by his 
physician. 

Since the purpose of the study was to observe 
motor reproductions of visual perceptions, it was 
necessary that each subject be able to see and draw 
well enough for administration of the BVMGT. No 
subjects in the experimental or control group had 
to be excluded for this reason. 

Subjects in the two groups were equated for 
chronological age and intelligence. Mental ages and 
IQs for all subjects were derived from the Stanford- 
Binet Intelligence Scale, Form L-M. The mean 
chronological ages of the experimental and control 
groups were 9-1 and 9-0, respectively, and the 
mean mental age and IQ for both groups, 7-11 and 
85. The experimental group included 20 males and 
5 females, and the control group 18 males and 
7 females. 

The BVMGT was administered individually, and 
each subject copied the designs on one 8½ × 11 inch 
sheet of paper. Care was taken to assure that the 
long axis of the subject’s paper remained parallel 
to the long axis of the stimulus card in order to 
avoid the possible orientational confusion reported 
by Hannah (1958). Rotation was defined as deviation 
of a design 45° or more from its proper 
orientation. Whenever a rotation occurred, the subject’s 
paper was immediately turned, and he was 
given the first of a series of learning cues in an 
effort to help him correct the rotation; all designs 
were redrawn on the back of the original sheet of 
paper. The five cues presented were as follows: 


1. The subject was asked to redraw the design. 

2. The stimulus card was oriented to the same 
position as the subject had rotated the design, 
and he was asked to redraw the design; then, 
the stimulus card was reoriented to its proper 
position, and the subject was asked to redraw 
the design. 

3. The subject was asked to point to the top and 
the bottom of the stimulus card and to the 
top and the bottom of his paper, then asked 
to redraw the design. 

4. The subject was asked to trace over the stimu- 
lus design with his fingers, then asked to redraw 
the figure. 

5. The subject was allowed to trace the design 
on a piece of thin copy paper placed over the 
stimulus card, then asked to redraw the 
figure in its original position. 


The series of cues was terminated as soon as 
the subject corrected his rotation of the stimulus 
design. 
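
The stopping rule for the cue series can be sketched in code. This is a minimal illustration only and not part of the original procedure: the function names, the cue descriptions as strings, and the scoring callback are assumptions made for the sketch, the deviation is assumed to be given as a magnitude in degrees, and the second cue (which involves two redrawings) is treated as a single step.

```python
from typing import Callable, Optional

ROTATION_THRESHOLD_DEG = 45  # rotation criterion stated in the text

# Paraphrased labels for the five cues; wording is illustrative only.
CUES = (
    "redraw the design",
    "redraw with card rotated, then with card in proper position",
    "point to top and bottom of card and paper, then redraw",
    "trace the stimulus design with the fingers, then redraw",
    "trace the design on thin copy paper, then redraw",
)


def is_rotation(deviation_deg: float) -> bool:
    """A drawing counts as rotated when it deviates 45 degrees or more."""
    return abs(deviation_deg) >= ROTATION_THRESHOLD_DEG


def cues_needed(redraw_deviation: Callable[[int], float]) -> Optional[int]:
    """Present cues 1..5 in order and stop at the first corrected redrawing.

    redraw_deviation(cue_number) is a hypothetical stand-in for the scored
    deviation of the drawing produced after that cue. Returns the number of
    the cue that corrected the rotation, or None if no cue corrected it.
    """
    for cue_number in range(1, len(CUES) + 1):
        if not is_rotation(redraw_deviation(cue_number)):
            return cue_number
    return None


# Hypothetical subject whose drawing is corrected only after the third cue.
scored = {1: 90.0, 2: 90.0, 3: 10.0}
print(cues_needed(lambda n: scored.get(n, 0.0)))  # 3
```

Returning the cue number rather than a yes/no result keeps the information the study analyzes, namely how many cues a subject required before correction.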


RESULTS 


The experimental and control groups were 
divided into two categories: (a) subjects making 
no rotations and (b) subjects making one or 
more rotations on the BVMGT. Chi-square 
analysis indicated that neurologically impaired 
subjects produced a significantly greater number 
of rotations than did nonneurologically impaired 
subjects (see Table 2). 

Subjects in both groups who rotated BVMGT 
figures were divided into two categories: (a) 
those who corrected rotations on one or more 
designs with the first learning cue and (b) those 
who required two or more cues for correction 
(see Table 3). Sixteen of the 21 subjects in the 
experimental group required two or more cues 
for correction. All nine subjects in the control 
group corrected rotations on the first cue. A 
significant relationship existed between neurological 
impairment and the number of cues required 
to correct rotations on the BVMGT. 


TABLE 2 

RELATIONSHIP BETWEEN NUMBER OF ROTATIONAL 
ERRORS AND NEUROLOGICAL IMPAIRMENT 

                     No          One or more 
Group             rotations       rotations        χ²ᵃ 
Experimental       4 (10)ᵇ         21 (15)        10.08* 
Control           16 (10)           9 (15) 

ᵃ With Yates correction for continuity (Fisher, 1948). 
ᵇ Expected frequency on basis of marginal totals in parentheses. 
* Significant at the .01 level. 


TABLE 3 

RELATIONSHIP BETWEEN NUMBER OF CUES REQUIRED 
FOR CORRECTION AND NEUROLOGICAL 
IMPAIRMENT 

                   Number of required cues 
Group               One          Two or more       χ²ᵃ 
Experimental       5 (9.8)ᵇ       16 (11.2)       14.8* 
Control            9 (4.2)         0 (4.8) 

ᵃ With Yates correction for continuity (Fisher, 1948). 
ᵇ Expected frequency on basis of marginal totals in parentheses. 
* Significant at the .01 level. 
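
The expected frequencies and chi-square values in Tables 2 and 3 can be recomputed from the printed cell counts. The sketch below is an illustration under stated assumptions, not the authors' own computation: the function name `chi_square_2x2` and its `yates` flag are invented here, and the continuity correction is applied in the usual way by subtracting 0.5 from each absolute difference between observed and expected counts.

```python
def chi_square_2x2(table, yates=True):
    """Chi-square for a 2 x 2 table given as ((a, b), (c, d)).

    Returns (statistic, expected), where expected holds the frequencies
    implied by the marginal totals. With yates=True the continuity
    correction (|O - E| - 0.5) is applied to every cell.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
                for i in range(2)]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            diff = abs(table[i][j] - expected[i][j])
            if yates:
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected[i][j]
    return statistic, expected


# Rows: experimental, control. Columns as printed in each table.
table_2 = ((4, 21), (16, 9))   # no rotations / one or more rotations
table_3 = ((5, 16), (9, 0))    # one cue / two or more cues

print(f"{chi_square_2x2(table_2)[0]:.2f}")               # 10.08
print(f"{chi_square_2x2(table_3)[0]:.2f}")               # 11.79
print(f"{chi_square_2x2(table_3, yates=False)[0]:.2f}")  # 14.69
```

Computed this way, the Table 2 counts reproduce the printed 10.08 and both sets of expected frequencies match the tables. For the Table 3 counts the corrected value is about 11.8 and the uncorrected value about 14.7, so the printed 14.8 is not matched exactly by either; all of these values exceed 6.63, the .01 critical value for 1 degree of freedom.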

Table 4 summarizes, for each of the nine de- 
signs on the BVMGT, the total number of 
rotational errors made by each group, the number 
of cues required by each group for correction 
of errors, and the range of cues and the average 
number of cues required by each group for 
correction of rotational errors. Rotations oc- 
curred most frequently on Designs 2, 3, 4, and 5 
with both groups. Design A was rotated by 
three experimental subjects but by none in the 
control group. One subject in each group rotated 
Design 7. Designs 1 and 8 each were rotated 
by one experimental subject and by no control 
subjects, and Design 6 failed to elicit rotations 
in either group. Inspection of Table 4 suggests 
that experimental subjects found Designs 2 and 4 
relatively more difficult to correct and Designs 
3 and 5 relatively easier to correct. 

The control group corrected all errors with 
the first cue. The experimental group corrected 
21 of its 41 errors with the first cue, 7 with the 
second, 6 with the third, none with the fourth, 
and 7 with the fifth. 


DISCUSSION 


Although the neurologically impaired group 
made a significantly greater number of rotations 
than the nonneurologically impaired group, nine 
of the latter also produced rotations. Because 
of the presence of so many false positives, fre- 
quency of rotations alone is, at best, a gross 
criterion for screening neurological impairment. 
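
The trade-off between the two screening criteria can be tabulated from the counts reported in the Results (21 of 25 experimental and 9 of 25 control subjects rotated; 16 experimental and no control subjects needed two or more cues). The following sketch is illustrative only: the class and attribute names are invented here, and it assumes that subjects who made no rotation count as not flagged under the cue criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCriterion:
    """Counts of subjects flagged by one screening rule (25 per group)."""
    name: str
    impaired_flagged: int   # experimental subjects meeting the rule
    control_flagged: int    # control subjects meeting the rule
    n_per_group: int = 25

    @property
    def hit_rate(self) -> float:
        return self.impaired_flagged / self.n_per_group

    @property
    def false_positive_rate(self) -> float:
        return self.control_flagged / self.n_per_group

    @property
    def accuracy(self) -> float:
        # Correct decisions: flagged impaired plus unflagged controls.
        correct = self.impaired_flagged + (self.n_per_group - self.control_flagged)
        return correct / (2 * self.n_per_group)


# Counts taken from Tables 2 and 3 of the text.
any_rotation = ScreeningCriterion("one or more rotations", 21, 9)
two_plus_cues = ScreeningCriterion("two or more cues needed", 16, 0)

for c in (any_rotation, two_plus_cues):
    print(f"{c.name}: hits {c.hit_rate:.2f}, "
          f"false positives {c.false_positive_rate:.2f}, "
          f"overall {c.accuracy:.2f}")
# one or more rotations: hits 0.84, false positives 0.36, overall 0.74
# two or more cues needed: hits 0.64, false positives 0.00, overall 0.82
```

On these counts the cue criterion removes all nine control-group false positives, at the cost of missing nine impaired subjects: the five who corrected on the first cue and, under the assumption above, the four who made no rotation.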

Neurologically impaired subjects who rotated 
designs on the BVMGT required a significantly 
greater number of cues to correct their rota- 
tions than subjects in the control group who 
rotated designs. None of the subjects in the 
control group required more than the first cue 
(simple redrawing) to correct rotations, but 16 
of the 21 experimental subjects needed two or 
more cues for correction. Thus, results confirm 
Ingram’s (1955) clinical impression that ability to 
correct rotations provides a more discriminating 
index of neurological impairment than frequency 
of rotation. 


TABLE 4 

SUMMARY OF ROTATIONAL ERRORS AND THE NUMBER OF CUES REQUIRED FOR CORRECTION FOR 
EACH FIGURE ON THE BVMGT 

            Number of      Number of cues    Range of cues     Mean number of 
            rotations      required for      required for      cues required 
                           correction        correction        for correction 
Figure      E      C       E      C          E      C          E      C 
A           3      0       5      0          1-3    0          1.7    0 
1           1      0       2      0          2      0          2.0    0 
2           7      3      17      3          1-5    1          2.4    1 
3          12      2      19      2          1-3    1          1.6    1 
4          12      2      34      2          1-5    1          2.8    1 
5           4      3       6      3          1-3    1          1.7    1 
6           0      0       0      0          0      0          0      0 
7           1      1       1      1          1      1          1.0    1 
8           1      0       2      0          2      0          2.0    0 
Total      41     11      86     11 


Whereas learning cues eliminated false positives 
in the control group, it should be noted that some 
false negatives occurred in the experimental group 
(i.e., five experimental subjects needed only one 
cue for correction). Also, experimental subjects 
varied considerably, not only in the number of 
cues required for correction, but in regard to the 
particular figures rotated. Designs A, 2, 3, 4, and 
5 lent themselves to rotation more frequently 
than other figures. Design 6 failed to elicit 
rotation in either group and Designs 1, 7, and 8 
were rotated so infrequently that their usefulness 
in future research is questionable. Ease of correction 
also varied with particular design. Designs 
2 and 4 warrant attention. For example, 
5 of the 12 experimental subjects who rotated 
Design 4 needed five cues to correct the rotation. 
Future research should explore more fully 
the variability among neurologically impaired 
children in regard to differences in (a) the particular 
designs rotated and (b) the number and 
type of cues which correct rotations. 

Although the results are encouraging, the 
present study should serve more as a stimulus 
for future research than as verification of a 
clinical technique. The most striking limitation 
is the sampling procedure, which, though dictated 
by pragmatic considerations, seriously restricts 
generalization of the findings. Subjects 
included in the experimental group were not 
selected on the basis of identical criteria. 
Seventeen had been given a neurological examination 
and EEG,¹ but the remainder were identified 
on the basis of behavior signs, case history, 
and/or classification according to clinical syn- 
drome. In contrast, absence of behavioral signs 
of neurological impairment was the sole criterion 
for inclusion of subjects in the control group. 
At the least, it would have been desirable to 
employ common criteria for discrimination be- 
tween subject groups. At best, it may be desir- 
able in the future to avoid the assumption of 
an overall construct of neurological impairment, 
and, instead, search for relationships between 
rotational behavior and other specific correlates 
of brain damage. 


¹ It is of interest to note that three of the four 
experimental subjects who failed to rotate designs 
were subjects who had not been administered the 
EEG in the selection procedure. 


REFERENCES 


BENDER, L. A visual-motor gestalt test and its 
clinical use. New York: American Orthopsychiatric 
Association, 1938. 

BYRD, E. The clinical validity of the Bender-Gestalt 
Test with children: A developmental comparison 
of children in need of psychotherapy and children 
judged well-adjusted. Journal of Projective Techniques 
and Personality Assessment, 1956, 20, 127-136. 

FABIAN, A. A. Vertical rotation in visual-motor performance: 
Its relationship to reading reversals. 
Journal of Educational Psychology, 1945, 36, 
129-154. 

FISHER, R. A. Statistical methods for research 
workers. New York: Hafner, 1948. 

GOLDSTEIN, K., & SCHEERER, M. Abstract and concrete 
behavior: An experimental study with special tests. 
Psychological Monographs, 1941, 53(2, Whole No. 
239). 

GRIFFITH, R. M., & TAYLOR, V. H. Incidence of 
Bender-Gestalt figure rotations. Journal of Consulting 
Psychology, 1960, 24, 189-190. 

HAIN, J. D. The Bender-Gestalt: A scoring method 
for identifying brain damage. Journal of Consulting 
Psychology, 1964, 28, 34-40. 

HANNAH, L. D. Causative factors in the production 
of rotations on the Bender-Gestalt designs. Journal 
of Consulting Psychology, 1958, 22, 398-399. 

INGRAM, W. Comment on Keller. American Journal 
of Orthopsychiatry, 1955, 25, 572-573. 

KELLER, J. The use of the Bender-Gestalt maturation 
level scoring system with mentally handicapped 
children. American Journal of Orthopsychiatry, 
1955, 25, 563-572. 

KOPPITZ, E. M. The Bender-Gestalt Test for young 
children. New York: Grune & Stratton, 1964. 


(Received November 5, 1965) 


Journal of Consulting Psychology 
1967, Vol. 31, No. 2, 209-212 


BEHAVIOR THERAPY TREATMENT APPROACH TO 
A PSYCHOGENIC SEIZURE CASE 


JAMES E. GARDNER¹ 
Childrens Hospital, Los Angeles 


A behavior therapy treatment approach was developed for a child manifesting 
psychogenic seizures. Treatment, in the form of 3 weekly counseling sessions 
with the parents only, involved altering intrafamilial reinforcement contingen- 
cies so that the child received parental attention for “appropriate” behaviors 
but not for “inappropriate” behaviors such as seizures. This treatment plan 
resulted in the rapid and complete cessation of seizure behavior. The func- 
tional relationship between the child’s seizure behavior and parental attention 
was demonstrated in a follow-up 26 wk. later. When parental attention was 
purposely reinstated for “inappropriate” behaviors, the child again manifested 
seizure behavior. The child’s seizure behavior once more ceased when parental 
attention was again purposely withdrawn for this behavior. 


Psychotherapeutic technique is based largely 
on one’s conception of the nature of psycho- 
pathology. Traditionally, the prevalent psycho- 
pathological model has been the “symptom- 
underlying disease” notion (Bandura, 1962) in 
which deviant behavior is viewed as a symptom 
of more fundamental psychodynamic forces. A 
consequence of this view is that psychothera- 
peutic treatment tends to focus on inferred in- 
ternalized conflicts which are usually assumed to 
be unconscious. The general therapeutic goal in 
this type of treatment is the modification of emo- 
tional responses through insight and/or catharsis. 

Alternative models of psychopathology and 
treatment have recently been put forth by be- 
havioristically oriented experimentalists and cli- 
nicians (Bandura, 1962; Eysenck, 1960; Skinner, 
1954; Wolpe, 1958). Such models may be placed 
under the general term “behavior therapy” (Eysenck, 
1960). Behavior therapy differs markedly 
from the symptom-underlying-disease model 
regarding its view of how maladaptive behaviors 
are established and maintained as well as treatment 
approaches to such behavior (cf. Eysenck, 
1960; Ferster, Nurnberger, & Levitt, 1962; 
Skinner, 1954; Wolpe, 1958). 


¹ Now at Psychological Center, Los Angeles. 

In general, the behavior therapy model con- 
ceptualizes neurotic or maladaptive behavior as 
learned behavior, not as a manifestation of a 
mental illness. Treatment tends to focus on the 
“unlearning” or “relearning” of specific habit 
patterns and may take place in one or more of 
the following ways: (a) by the direct manipula- 
tion of environmental reinforcement contingen- 
cies (Ayllon & Haughton, 1962; Ayllon & 
Michael, 1959), (b) by direct countercondition- 
ing and/or extinction procedures (Wolpe, 1958), 
or (c) by the provision of more appropriate 
social role models (Bandura, 1963). 

With behavior therapy, as with traditional 
therapy, the type of treatment approach should 
be dictated by the type of problem. In this case, 
direct environmental manipulation of major so- 
cial contingencies within the family was the 
treatment selected for a child manifesting seizure 
behavior of nonorganic origin. 


SUBJECT 


The subject (S) was a light-complected, at- 
tractive, 10-year-old Negro female. She was the 
oldest of three children, attended a private school 
where she maintained average grades, and was 
considered well liked by peers and teachers. 

On the surface, as judged from background 
material initially taken by an attending physi- 
cian and later elaborated by the therapist, the 
family history appeared unremarkable. The par- 
ents had never been separated or divorced, father 
was noted for steady employment and attention 
to his family, S had never manifested behavioral 
or other problems at school, disagreements be- 
tween sibs at home seemed at about the “usual” 
level, and there was no history of prior seizure 
behavior on the part of any member of the fam- 
ily. 

However, close analysis of the interactions be- 
tween family members suggested some possible 
antecedents to S’s seizure behavior. For example, 
S had long been noted for her rivalry with her 
next younger sister and both children competed 
strongly for parental attention. Also, a “model” 
for psychosomatic behavior had been inadver- 
tently provided by the mother some months 
prior to S’s manifestation of seizure behavior. At 
this time, the mother had experienced a headache 
of such intensity that she “rocked and banged” 
in pain and had to be taken to the hospital. 
Some further antecedents in the form of apparent 
inadvertent parental “shaping” of deviant behav- 
ior also seemed in evidence. This will be ampli- 
fied below. However, it must be noted that this 
apparent shaping for deviate behavior on the part 
of the parents seemed well within the bounds of 
the type of situation that could relatively easily 
arise in a family (e.g., giving in to tantrum 
behavior, giving attention for somatic complaints, 
etc.). 

According to the parents, S had for some 
weeks prior to her seizures manifested increasingly 
frequent somatic complaints. The parents 
could not recall whether the increase in somatic 
complaints closely followed the mother’s illness 
or not. The S had also manifested several tem- 
per tantrums during the several weeks prior to 
the seizure episode, one of which, in the mother’s 
words, “looked sort of like a convulsion.” 

The seizure episode which resulted in S’s hos- 
pitalization began mildly with complaints of a 
stomachache followed by a headache. These 
complaints were followed by rhythmical head 
rolling accompanied by hair pulling. The S was 
then taken to the hospital by her parents. 

In the hospital, the results of all physical 
tests, including an electroencephalograph study, 
were either negative or ambiguous. Psychologi- 
cal test results suggested a “hysteric-type” per- 
sonality but no indications of severe emotional 
difficulty or neurological impairment. However, 
it was noted that S seemed to feel a high degree 
of sibling rivalry. It was also noted that a hys- 
teric-type personality might be considered high 
in potential for some form of conversion reaction 
and/or somatic complaint. 


PROCEDURE 


Psychological consultation was initiated with S's 
parents prior to S’s discharge from the hospital. The 
S received no counseling either during or after her 
hospitalization. In order to attempt to assess more 
clearly the effects of the parental counseling on S’s 
seizure behavior, the usual seizure-inhibiting medi- 
cations were withheld under medical supervision. 

The parents were seen jointly in three weekly 1- 
hour sessions. The first session was conducted prior 
to S’s discharge from the hospital. When S returned 
home, a treatment plan devised in the first session 
was immediately put into effect by the parents. 

In the counseling sessions, the emphasis was im- 
mediately placed on (a) analyzing the reinforce- 
ment contingencies within this particular family’s 
structure and (b) devising means of altering some 
of these contingencies in an attempt to alter S’s 
deviant behavior with a minimum of friction. 

The parents seemed to readily grasp the behav- 
ioral principles involved. Whenever possible, they 
were encouraged to develop or elaborate aspects of 
the treatment plan. Whenever one or the other 
verbalized some plan which seemed fairly likely to 
be effective in this situation, further discussion was 
encouraged by the therapist. In this manner, an 
attempt was made to “shape” the parents’ behavior 
toward more effective ways of dealing with S’s devi- 
ant behavior. This procedure seemed to be more 
effective than direct suggestion, since the parents de- 
veloped the notions themselves in the context of 
what was feasible in their home situation. 

The relatively simple treatment plan was devised 
in the first session with the parents. The two later 
sessions were spent in elaborating or clarifying as- 
pects of the initial basic treatment plan. The pro- 
gram consisted of the parents (a) being “deaf and 
dumb” whenever S manifested seizures or other 
highly deviant behavior such as tantrums, (b) re- 
warding S with their attention whenever S mani- 
fested appropriate behavior such as playing with 
sibs, helping mother, drawing, etc., while (c) being 
alert for possible substitute behavior on S’s part, such 
as increased somatic complaints (which, if mani- 
fested, were to also be dealt with using the “deaf 
and dumb” method of nonreinforcement). 

Follow-up telephone interviews were conducted 
approximately once every 2 weeks for 30 weeks. 
The frequency of S’s somatic complaints, tantrums, 
and seizures during this period was recorded from 
data furnished by the parents. The importance of 
this information was stressed to the parents, and it 
appeared that they attempted to comply with the 
requests for accurate observations. This method of 
estimating the frequency of the selected behaviors 
was obviously imprecise. However, laboratory con- 
trols were not possible under the circumstances. 
Also, ethical and practical considerations militated 
against delaying treatment for S until a more ade- 
quate baseline could be obtained, although the need 
for some estimate of the prehospitalization level of 
S’s seizure and related behaviors was realized. In 
order to approximate such a baseline, the parents 
were requested to estimate the frequency of S’s somatic 
complaints, tantrums, and seizures for the 
period 2 weeks prior to her hospitalization. The 
parents estimated S’s somatic complaints to be about 
6 to 8 per week, and tantrums about 5 to 6 per 
week in frequency. Using this initial estimation, and 
the subsequent weekly follow-up parent reports, 
data were obtained which appeared adequate 
to reflect any major trends in the frequency of S’s 
deviant behaviors. 

At the 26-week follow-up interval, the parents 
were instructed to reinstate attention for S’s deviant 
behaviors, including seizures if such should occur. 
This was done in order to attempt to demonstrate 
the functional relationship between S’s seizure be- 
havior and parental attention, as well as to clarify 
the differential diagnosis between psychogenic and 
organic seizures. 


RESULTS 


Within 2 weeks of S’s discharge from the hos- 
pital and the concomitant institution of the treat- 
ment plan as originated in the parent-counseling 
sessions (a) the frequency of S’s seizure behav- 
ior dropped to zero, (b) S’s tantrum behavior in- 
creased in frequency to about three per week 
(but did not rise to the estimated prehospitaliza- 
tion level of five to six per week), then dropped 
out altogether within the month, and (c) S’s so- 
matic complaints gradually increased again to a 
frequency of about three per week, a level about 
half that of the pretreatment level as estimated 
by the parents.² 

In the 26th week of follow-up, the parents 
were instructed to deliberately reinstate attention 
for S’s somatic complaints as well as tantrums 
and seizures should such reappear. They were 
also instructed to return to the original “deaf and 
dumb” treatment for deviant behaviors once 
such behaviors had reappeared. 


² Subsequent to this case, the author informally 
checked the frequency of somatic complaints of 
children of various ages as reported by their parents. 
Data from five families showed the median 
number of somatic complaints per child to be a 
little over four per week. This evidence, though 
neither systematic nor from a large or random sample, 
suggests at the very least that S’s three per week 
baseline of somatic complaints is probably not suspiciously 
high. 

Within 24 hours of the deliberate reinstate- 
ment of parental attention for S’s somatic com- 
plaints, this class of behavior showed a sharp 
increase to about one per hour. Then S mani- 
fested a seizure. As instructed, the parents then 
returned to the initial treatment plan of reinforcing 
appropriate behavior while ignoring deviant 
behavior. Subsequent to the parents’ reinstatement 
of these contingencies, S manifested 
no more seizure behavior, two temper tantrums, 
and a rise in somatic complaints to a frequency 
of seven per week for the week following the 
deliberate reinstatement of the pretreatment contingencies. 
The somatic complaints then decreased 
gradually to about three per week, as 
before. 


DISCUSSION 


One of the principal consequences of the as- 
sumption that maladaptive behaviors are learned 
is that an individualized treatment plan must be 
developed. In the present case, in the absence of 
clear evidence regarding an organic basis for the 
seizures, S’s seizure and other deviant behaviors 
were regarded as learned behaviors established 
and maintained by their consequences, in this 
case, the obtaining of parental attention. 

A treatment plan involving the manipulation 
of reinforcement contingencies in S’s home en- 
vironment was formulated. This was accomp- 
lished by the parents who altered their responses 
to S’s appropriate as well as inappropriate behavior.
The parents were assisted in making these changes
by three counseling sessions.

Conceptually, the development of the seizure 
behavior in this case can be viewed as a func- 
tion of differential reinforcement (shaping) for 
deviant behavior over a period of time. The 
parents, during the three counseling sessions, 
noted that they reacted with natural and appro- 
priate concern for any somatic complaints mani- 
fested by any of their children. However, the 
increasing frequency of such complaints from S, 
with no apparent physical basis, gradually tended 
to “desensitize” the parents to such complaints, 
and they began giving less attention to them. 
Shortly after the parents had virtually ceased 
reacting to S’s somatic complaints, S manifested 
a head-banging, hair-pulling tantrum which suc- 
ceeded in quickly eliciting much parental atten- 
tion of a rather positive nature (i.e., concern, 
solicitousness). It appeared that there may have 
been several such adjustments and readjustments, 
with the cycle ultimately resulting in S’s mani- 
festation of seizure behavior. 
On the other hand, an alternative viewpoint
might be that the seizures were only incidentally, 
if at all, related to the reinforcing effects of 
parental attention, and that their cessation was 
simply a physiological remission. That such remission
was not the case was shown in the follow-up
phase 26 weeks after the initial counseling
session with the parents. At this point, in
order to demonstrate that S’s seizure behavior 
was a function of parental attention as well as to 
more clearly establish the differential diagnosis 
between organic and functional aspects of the 
seizures, parental attention was again made con- 
tingent primarily upon deviant behavior. 

Since no seizure activity was being manifested 
by S at that time, the parents were instructed 
to reinforce, with attention, S’s somatic complaints.
The need for this phase of the program, in terms of
clarifying the case, was explained to
the parents, and they were in full agreement that 
it should be carried out. 

Within 24 hours of the deliberate initiation 
of this parental reinforcement program, S mani- 
fested seizure behavior. The parents had been 
forewarned that S might again manifest seizure
behavior under these conditions. As instructed, 
they then once more withdrew attention for all 
deviant behaviors while, as before, concomitantly 
giving much attention to S for more appropriate 
behaviors. 

Conceptually, when the parents reinstated their 
concern regarding S’s somatic complaints, this 
acted as a cue that the whole class of such be- 
haviors (i.e., somatic complaints, tantrums, seiz- 
ures, etc.) were now functional again with re- 
gard to the obtaining of parental attention. It is 
not suggested that the reappearance of such be- 
haviors was volitional or necessarily even con- 
scious on S’s part. Rather, it is suggested that 
specific behaviors may be a function of specific 
stimulus circumstances, in this case largely ex- 
ternal stimulus circumstances emanating from 
parental behavior. 

The demonstration of the functional relation- 
ship between parental attention and S’s seizure 
behavior was made possible by using S as her own 
control. For ethical and/or practical reasons this 
is not always feasible in a clinical setting. How- 
ever, the baseline regarding the frequency of 
somatic complaints, tantrums, and seizures in
this case seemed to reflect the effect of the treat- 
ment program. Because of this, a specific hy- 
pothesis relating the effect of parental attention 
on S’s seizure behavior could be assessed. 

Further follow-up interviews at the 28- and 
30-week intervals revealed that S had mani- 
fested no further seizure behavior. A follow-up 
call to the mother 1 year after S’s hospitaliza- 
tion revealed that S had manifested no seizure 
behavior in the interim. The mother also reported
at this time that S seemed happy and
well adjusted both at home and at school, that 
her grades were good, and that she seemed less 
competitive with her younger sister. 

The primary implications of the above find- 
ings appear to be that (a) maladaptive or devi- 
ant behavior patterns may, in some cases, be a 
function of consequences of a socially reinforc- 
ing nature such as the control or manipulation 
of others, (b) it is possible to rearrange rein- 
forcement contingencies even without complete 
environmental control, and (c) the systematic 
alteration of parental behavior can be a potent 
and efficient force in altering the maladaptive 
behavior of a child. 
CHARACTERISTICS OF PARTICIPANTS AND NONPARTICIPANTS
IN INDIVIDUAL TEST-INTERPRETATION INTERVIEWS 


MORRIS TAGGART 
Garrett Theological Seminary 


The study compared students who requested an interpretive interview after a
battery of tests with those who did not request an interview. An interview
sample (N = 96) was compared with a noninterview sample (N = 35) using
scores on the Minnesota Multiphasic Personality Inventory (MMPI), the
Theological School Inventory (TSI), and the Ohio State University Psychological
Test (OSPT). Only the MMPI comparisons yielded significant differences
between the 2 groups. The interview sample had lower scores on Scales
K and R (Welsh), and higher scores on Scales 2 (D), 5 (Mf), and A (Welsh).
The interview sample also had a higher mean grade-point average than the
noninterview sample. An explanation of some of the observed differences was
offered in terms of differences among Ss along the introvert-extrovert
continuum.


Where batteries of tests are routinely given, 
for example, as part of a registration procedure 
for entering students, it is common practice to 
invite test takers to participate in subsequent 
interpretive interviews. People differ in their 
response to such an invitation. The question is 
one of discovering if this difference is related to
other differences, especially with regard to the 
test data themselves. 

The characteristics of the volunteer in psycho- 
logical research have been a topic for investiga- 
tion (Martin & Marcuse, 1958). There has also 
been interest in discovering what it is that in- 
duces people to seek personal counseling (Ter- 
williger & Fiedler, 1958). Whether or not to 
participate in an interpretive interview would
appear to be a combination of these two situations.

The present study proposed to examine the
personality characteristics of participants and
nonparticipants in test-interpretation interviews.
Specifically, the following hypotheses were proposed:
Participants have less defensiveness, defined
in terms of K scores on the MMPI, than
nonparticipants; participants have a higher tendency
to report themselves as feeling anxious, defined
in terms of scores on Welsh’s A factor
(Welsh, 1956); participants are less likely to
resort to denial of symptoms, defined in terms of
scores on Welsh’s (1956) R factor.


METHOD 


All entering students at Garrett Theological Seminary
in the year 1964-65 took a battery of tests as
part of their registration procedure. The battery
included the Minnesota Multiphasic Personality
Inventory (MMPI), the Ohio State University
Psychological Test (OSPT), and the Theological
School Inventory (TSI; Dittes, 1964). During the
testing session, held at the beginning of each quarter,
the students were informed that opportunity for
interview regarding the tests would be available for
those who wished it. A reminder of the invitation
was sent out by mail a few weeks later to those who
had not already scheduled interviews. No further
attempt was made to encourage students to attend
interpretive interviews. It was discovered that,
occasionally, a student’s faculty advisor would
recommend that a student take the initiative in setting up
an interview. In no case did a faculty advisor have
access to test scores. The resulting two groups, an
interview sample (N = 96) and a noninterview sample
(N = 35), became the basis of the investigation.


RESULTS 


The results are summarized in Table 1. The 
OSPT comparisons yielded no significant differ- 
ences between the two groups. There were sig- 
nificant differences between the groups on K, A, 
and R, and all in the predicted direction. The 
difference with respect to R was significant at 
the .05 level only, and might be better described 
as suggestive. Further differences between the 
groups appeared on D and Mf. The interview 
sample had significantly higher grade-point aver- 
ages than the noninterview sample. The TSI com- 
parisons yielded no significant differences, and 
the means are not reported in the interest of 
space. 


DISCUSSION


The specific hypotheses regarding differences 
between the groups on K, A, and R were upheld, 
although the data are perhaps only suggestive in 
the case of R. 
NOTES AND COMMENTS


TABLE 1


MEAN SCORES OF INTERVIEW AND NONINTERVIEW GROUPS


                   Interview sample    Noninterview sample
                       (N = 96)             (N = 35)
Scale                M        SD          M        SD         p

OSPT I             22.11     4.51       21.80     4.70       ns
OSPT II            40.97     9.52       38.69     9.21       ns
OSPT III           41.10     9.17       39.37     8.88       ns
OSPT total        104.18    20.24       99.86    16.33       ns
MMPI
  L                 4.19     2.09        4.49     1.76       ns
  F                 2.74     2.15        2.66     2.35       ns
  K                17.71     4.28       19.63     4.21      .025ᵃ
  Hsᵇ              12.19     3.49       11.91     3.10       ns
  D                18.26     4.47       16.54     2.73      .05
  Hy               21.94     3.89       22.34     3.32       ns
  Pd               22.10     3.61       23.03     3.55       ns
  Mf               31.65     6.17       28.77     5.95      .02
  Pa                9.79     2.46       10.00     2.15       ns
  Pt               27.28     5.03       26.49     5.65       ns
  Sc               25.88     4.94       26.57     5.88       ns
  Ma               19.41     5.41       20.31     6.61       ns
  Si               23.88     8.14       21.63     7.29       ns
  A                 8.53     6.60        4.83     3.77      .01ᵃ
  R                16.50     4.25       18.03     3.97      .05ᵃ
  Es               48.52     5.02       49.91     7.58       ns
Grade average       1.77      .65        1.44      .67      .02


ᵃ One-tailed test of significance.
ᵇ These scale scores include K corrections.
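
The group comparisons in Table 1 can be checked approximately from the reported means and standard deviations. The sketch below assumes a pooled-variance Student t test for independent groups, which the note does not specify; the function name pooled_t is illustrative only and does not come from the study.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t for two independent groups, computed from summary
    statistics under the assumption of equal population variances."""
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# K scale, interview (N = 96) versus noninterview (N = 35), values from Table 1.
t_k, df_k = pooled_t(17.71, 4.28, 96, 19.63, 4.21, 35)
print(f"K: t({df_k}) = {t_k:.2f}")
```

Under this assumption the K comparison gives a t of about -2.28 on 129 df, which is in line with the one-tailed .025 level shown in the table. Scales with clearly unequal variances (for example, A) might call for a different test than the one sketched here.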


The interpretation of K as a measure of de- 
fensiveness would appear to be reasonable in the 
setting of the study. The other common view of 
K is that it measures a social desirability set on 
the part of the respondent. If such a response set 
had behavioral implications outside the test situation,
we might expect students with high K
scores to seek an interview with the faculty person
responsible for the testing program. In fact,
students with lower scores on K were more likely 
to do so. 

It would appear that part of the motivation 
for seeking test interpretations is a greater cog- 
nitive awareness of anxiety, reflected in higher 
A scores. This finding corresponds to that of 
Terwilliger and Fiedler (1958) who found that 
anxious and less self-satisfied individuals are 
more likely to seek therapeutic help than are 
those who do not express such feelings. As hap- 
pens from time to time, the desire for a test in- 
terpretation is the first step in seeking thera- 
peutic help. 

That lower scores on the R factor are related 
to participation in interpretive interviews is
consistent with the view of R as a measure of the
tendency to deny or rationalize symptoms. Fac- 
tor-analytic studies of the MMPI (Corah, 1964) 
have suggested that Welsh’s R is identical with 
the introversion-extroversion continuum. Ey- 
senck (1960) hypothesized that extroverted neu- 
rotics will show a preponderance of somatic 
symptoms of anxiety while introverted neurotics 
will be characterized more by cognitive symptoms 
of anxiety. This hypothesis was confirmed by
Corah (1964) and the finding may have implica- 
tions for the present study. It could be argued 
that introverted students, that is, those with low 
R scores, feel their emotional upsets cognitively 
and look to the test interpretation to help them 
handle such feelings. On the other hand, the 
extroverted person may, when emotionally up- 
set, tend to produce somatic symptoms for which 
he can take medication or go to the student health 
service. Unfortunately, it was impossible to 
gather data on the number of visits to the stu- 
dent health service made by the subjects of this 
study, but in any replication such data would be 
valuable. 




The finding of significant differences between 
the groups on D and Mf is not such that plausi- 
ble rationales immediately suggest themselves. It 
may well be that depression is the most easily 
recognized form of cognitive anxiety, and there- 
fore high D scorers are likely to seek interviews. 
The Mf scores are somewhat confused in any 
case since they contain both men’s and women’s 
scores.

There is no clear reason as to why the inter- 
view sample should have had higher grade-point 
averages than the noninterview sample. Terwilli- 
ger and Fiedler (1958) found that students who 
felt the need for therapeutic help tended to have 
slightly better grades than those who did not 
approach the counseling service. It could be con- 
tended that a test interpretation is sufficiently 
therapeutic to allow a student release from his 
more crippling anxieties, freeing him to do bet- 
ter academically, but the present study did not 
provide data to test this view. 
EXTROVERSION AND STIMULUS-SEEKING MOTIVATION


FRANK FARLEY¹
Institute of Psychiatry, University of London 


AND 


SONJA V. FARLEY 
Carnegie Library, Lambeth 


From Eysenck’s personality theory and recent notions of stimulus-seeking
behavior, it was predicted that extroversion would correlate with stimulus
seeking as measured by the Sensation Seeking Scale (SSS) of Zuckerman,
Kolin, Price, and Zoob (1964). The correlation found was .47, p < .01. It
was concluded that this result was consistent with previous findings and that
it aids in establishing the construct validity of the SSS.


In a recent paper Zuckerman, Kolin, Price, 
and Zoob (1964) reported the development of 
an objective questionnaire measure of sensation 
seeking—Sensation Seeking Scale (SSS)—in an 
attempt to quantify psychometrically the con- 
struct of optimal stimulation. This was a forced- 
choice factored scale of 34 items, with 4 items 
scored for males, 8 for females, and 22 for both 
sexes. They reported preliminary validation for 
the scale in terms of a positive correlation with 
field independence as measured by the Embedded
Figures Test—field independent subjects (Ss) being
more responsive to internal sensations than field
dependent ones—and a negative correlation with
anxiety, as sensation seeking involves an enjoyment
of tension-raising situations.


¹ Now at the University of Wisconsin.

Eysenck (1963) has summarized the evidence 
supporting the view that because of the hypothe- 
sized greater inhibitory potential of the extro- 
vert as compared to that of the introvert 
(Eysenck, 1960), the extrovert will seek 
arousal-producing stimuli so as to maintain some 
optimum level of “arousal potential” (Berlyne, 
1960), whereas introverts, with a hypothesized 
high excitatory potential, will attempt to avoid 
arousal-producing stimuli. This conception has 
received some experimental confirmation (Eysenck,
1963), though the amount of relevant
research is not extensive.
The present paper reports on the relationship
between extroversion and stimulus seeking as 
purportedly measured by the SSS. 

One hundred male Ss were administered the 
SSS and the Eysenck Personality Inventory 
(EPI; Eysenck & Eysenck, 1964). Sixty-eight 
of the Ss were apprentices at a motor works 
near London, and the remaining 32 were paid 
volunteers for a series of experiments at the 
Institute of Psychiatry. These latter Ss were 
largely civil service employees. Only the 26 
items on the SSS which loaded .30 or higher 
for males on the major sensation-seeking factor 
established in the scale construction (Zuckerman 
et al., 1964) were used. Extroversion was measured
by the extroversion (E) scale of the EPI.


RESULTS 


The mean SSS score was 14.57, SD = 3.93.
The mean E score was 15.14, SD = 4.36.
The product-moment correlation between the
SSS and the E scale was .47, p < .01. An item
analysis was done between the 26 SSS items
and the E scores. Of the 26 items, 10 had correlations
with E significant at or beyond the
.05 level. The alternatives chosen by extroverts
on each of these 10 forced-choice items are presented
below, using the original item numbering
of Zuckerman et al.


7. I like to explore a strange city or section
of town by myself, even if it means getting
lost.

9. I would like to try some of the new drugs
that produce hallucinations.

11. I sometimes like to do things that are a
little frightening.

17. I would like to take off on a trip with
no preplanned or definite routes, or timetables.

21. I would like to have the experience of
being hypnotized.

22. The important goal of life is to live it to
the fullest and experience as much of it as
you can.

23. I would like to try parachute jumping.

26. I prefer friends who are excitingly
unpredictable.

28. I often find beauty in the “clashing”
colors and irregular forms of modern
paintings.

33. When I feel discouraged I recover by
going out and doing something new and
exciting.
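
The significance of the product-moment correlation reported above (.47 with 100 Ss) can be checked with the usual conversion of r to a t statistic on N - 2 degrees of freedom. The sketch below is offered only as an illustration; the function name t_from_r is hypothetical, and the note does not say how its significance level was obtained.

```python
import math


def t_from_r(r, n):
    """Convert a Pearson correlation r based on n pairs into a
    t statistic with n - 2 degrees of freedom."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df


# Correlation between the SSS and the E scale for the 100 male Ss.
t_r, df_r = t_from_r(0.47, 100)
print(f"r = .47: t({df_r}) = {t_r:.2f}")
```

The resulting t of about 5.27 on 98 df is well beyond the two-tailed .01 critical value of roughly 2.63, which agrees with the reported p < .01.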

DISCUSSION


Certainly each of the 10 items listed above 
reflects stimulus-seeking motivation and prefer- 
ence for sensory variability. This fits with the 
high risk taking of the extrovert (Lynn & Butler, 
1962), his more frequent alternation behavior 
(Eysenck & Levey, 1965), greater alcohol and 
cigarette consumption (Eysenck, 1963), greater 
extent of physical movement (Rachman, 1961), 
less stimulus-deprivation tolerance (Petrie, Col- 
lins, & Solomon, 1960), and greater pain toler- 
ance as compared with introverts (Lynn & 
Eysenck, 1961). 

The present results aid in establishing the 
construct validity of the SSS as a measure of 
stimulus seeking, in that the correlation with 
extroversion could be deduced from a theo- 
retical framework of stimulus-seeking behavior 
from which other related predictions had been 
confirmed. 
FACTOR ANALYSIS OF THE COLLEGIATE WAIS¹


DALE J. SHAW 
State Hospital, Jamestown, North Dakota 


Cohen’s (1957) factor analysis of the WAIS 
standardization population resulted in a well- 
known three-factor solution. The present study 
determines the factorial structure of the WAIS 
in the collegiate population and compares this 
structure to that found by Cohen in his more 
general analysis. 

The subjects (Ss) were 100 college students 
from a major midwestern university who were 
evenly divided for sex. Ten different “majors” 
were included, and all academic levels were 
represented. Variables included in the analysis 
were age, sex, academic level, and WAIS sub- 
test scaled scores. Communalities were estimated 
as the squared multiple r of each variable with 
the others. The matrix was factored by the 
method of principal axes and rotated to oblique 
simple structure. Three minimally correlated 
factors were identified. 

Factor I replicates Cohen’s (1957) broad 
“verbal comprehension” factor. Factor II is 
similar to Cohen’s “perceptual organization” 
factor but seems to include a definite psycho- 
logical element as well. The three WAIS sub- 
tests which identify this factor are all obviously 
timed tasks, demanding in nature, and usually 
viewed by the testee as difficult. All are apt to 
arouse negativistic or defeatist attitudes that are 
productive of hopelessness, impotence, and poor 
performance. Thus, persons doing well on tests 
included in this factor might be those with the 
ability to make realistic appraisals of their skills 
and act in a calm, confident, efficient fashion 
when confronted with difficult problem-solving 
situations. Low scorers, on the other hand, might
best be thought of as persons lacking this psychological
element rather than those with perceptual
or organizational deficits.


¹ An extended report of this study may be obtained
without charge from Dale J. Shaw, State
Hospital, Jamestown, North Dakota or for a fee
from the American Documentation Institute. Order
Document No. 9202 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of
Congress, Washington, D. C. 20540. Remit in advance
$1.25 for microfilm or $1.25 for photocopies,
and make checks payable to: Chief, Photoduplication
Service, Library of Congress.

Factor III is clearly artifactual and reflects 
only feminine superiority in perceptual speed. 

Cohen’s third factor, “freedom from distrac- 
tion,” did not appear in this analysis. Perhaps, 
in the collegiate population, where every member 
is engaged in the demanding task of acquiring a 
college education, this factor is absent because 
of the relatively equal ability of all subjects to 
remain free of distraction. 

The WAIS factorial structure in the collegiate
population seems sufficiently different from
that in the general adult population to justify 
specific interpretive principles. Further, the 
nature of Factor II in this group suggests that 
scores on this factor might possess concurrent 
or predictive validity with achievement and 
motivation.


TABLE 1


ROTATED FACTOR MATRIX


Variable                   I      II     III

Age                      [entries illegible in source]
Sex                      -05    -31      48
Academic level            52     00      02
Information               47     14      05
Comprehension             52     06      12
Arithmetic                12     53     -10
Similarities              50     17     [illegible]
Digit span               [partly illegible: 32, 1?]
Vocabulary                59     15      19
Digit symbol              12    -02      45
Picture completion        28     09      01
Block design              05     51     -09
Picture arrangement       06     21     -19
Object assembly           09     41     -01


REFERENCE


Cohen, J. The factorial structure of the WAIS between
early adulthood and old age. Journal of
Consulting Psychology, 1957, 21, 283-290.
SOME INTELLECTUAL CORRELATES OF SCHIZOID INDICATORS:
WAIS AND MMPI¹


B. THOMAS HARWOOD 
Marshall University 


Various patterns of WAIS scores have been 
used as diagnostic indicators in clinical practice 
(Payne, 1961; Rabin, 1965; Wechsler, 1958). 
Although the weight of the evidence favored the 
existence of differences between diagnostic 
groups, these differences were of limited value 
in clinical practice. However, these differences 
did have theoretical and heuristic value. Previ- 
ously these trends and indicators were demon- 
strated using subjects with known psychopathol- 
ogy. These data were difficult to generalize to a 
normal population. One purpose of this research 
was to provide data which generalized more 
readily. The positive difference between the 
Verbal IQ and the Performance IQ as related to 
schizophrenia was the specific area investigated. 

All full-time incoming freshman students of a 
4-year state college were given the MMPI. From
a group of 533 freshman male students, profiles
were selected in which the schizophrenic scale (Sc)
was the highest of nine clinical scales. Forty-nine
male students had a peak score on the Sc scale. 
Fifty profiles were selected at random from the 
remaining 484 profiles to be used for control 
purposes. Profiles were eliminated which did not 
meet MMPI validity requirements. The subjects 
in the Sc group had to have scored a minimum 
T score of 60 on the Sc scale to be included. 
The profiles were identified, but only their names 
and addresses were known by the experimenter 
until all testing was completed. The final Ns 
were 23 for the Sc group and 28 for the control 
group. 

The range of Verbal-Performance scores for
the Sc group was -15 to +26. The range of
Verbal-Performance scores for the control group
was -20 to +23. The median was +2 for each
group. Chi-square analysis revealed no significant
difference.


¹ An extended report of this study may be obtained
without charge from B. Thomas Harwood,
Department of Education, Pago Pago, Tutuila,
American Samoa 96920, or for a fee from the
American Documentation Institute. Order Document
No. 9203 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress,
Washington, D. C. 20540. Remit in advance $1.25
for microfilm or $1.25 for photocopies, and make
checks payable to: Chief, Photoduplication Service,
Library of Congress.

The mean IQ scores of the Sc group were
Verbal IQ, 105.39; Performance IQ, 100.52; and
Full Scale IQ, 103.57. The difference between
the Verbal IQ and Performance IQ was statistically
significant (D = 4.87, t = 2.03, p < .05,
one-tailed). The mean scores of the control
group were Verbal IQ, 109.25; Performance IQ,
107.86; and Full Scale IQ, 109.00. The difference
between the Verbal IQs of the two groups was not
significant, but the difference between
the Performance IQs was significant (D = 7.34,
t = 2.42, p < .05, one-tailed). As the variances
for the Full Scale IQ were not homogeneous,
the Mann-Whitney U test was used to test
the difference (D = 5.43, U = 233.5, z = 1.68,
p < .10, two-tailed).
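
The z value reported for the Mann-Whitney U test can be reproduced from U and the two group sizes by the large-sample normal approximation. The sketch below assumes no correction for ties or continuity, since the note mentions neither; the function name mann_whitney_z is illustrative only.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic
    (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


# Full Scale IQ comparison: Sc group (N = 23) versus control group (N = 28).
z = mann_whitney_z(233.5, 23, 28)
print(f"z = {z:.2f}")
```

The approximation gives a z of about 1.68 in absolute value, matching the figure in the text; the sign depends only on which group's U is entered.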

These results do not controvert the thought that 
persons with schizoid characteristics differ from 
the general population in intellectual function- 
ing, but the overlap was so large that these 
differences were of limited value in a clinical 
setting. However, the control and Sc groups 
were drawn from a fairly homogeneous, normal 
population and were equated for age, sex, and
years of education. Thus, the small differences 
noted assumed added significance. These data 
suggested that further research into the dif- 
ferences in intellectual functioning correlated 
with personality types would be fruitful. 
BIRTH ORDER AND STUDENT CHARACTERISTICS:
A REPLICATION¹


C. EUGENE WALKER AND JAMES TAHMISIAN


Westmont College 


Considerable interest has been generated regarding
the relationship between birth order and other 
variables. One study reported in this journal 
(Altus, 1965) utilized as subjects students from 
the University of California at Santa Barbara 
(UCSB). The general finding of this study was 
that firstborns tend to be overrepresented in 
this population, and that there is a tendency 
for firstborns to have higher verbal and mathe- 
matical scores on the Scholastic Aptitude Tests 
(SAT) of the College Entrance Examination 
Board, though statistical significance was actually 
reached only in the case of verbal scores for 
females. 

The present authors felt that presentation of 
these data afforded an excellent opportunity to 
replicate the Altus study on a different but 
comparable population of students. The data 
presented in the present article were obtained 
from students of Westmont College—a college 
of approximately 600 students, located in Santa 
Barbara, California, and supported by a number 
of protestant denominations of which the Baptist 
faith predominates. The basic quality and back- 
ground of the students is roughly equivalent to 
that of UCSB with the exception that virtually 
all of the students come from homes in the 
conservative protestant religious tradition and 
have chosen to attend a private, religiously 
oriented college. 

¹ An extended report of this study may be obtained
without charge from C. Eugene Walker,
Chairman, Division of Education and Psychology,


The general hypothesis investigated was that
the same overrepresentation of firstborns would
be evident in our sample and that the same
tendencies toward superior ability would prevail.
The birth order for each entering student at 
Westmont College in the fall of 1965 was deter- 
mined, and his SAT scores obtained. In addition, 
scores on the Repression-Sensitivity (R-S) and 
Marlowe-Crowne Social Desirability (M-C) 
Scales were available and were included in the 
analysis. Due to the small sample size, it was 
decided that only one gross comparison could 
appropriately be made. Therefore, the firstborn 
and only child categories were combined and 
compared with second and third borns (fourth- 
and later-born categories were eliminated due to 
small numbers in each). The final N utilized was
142. The data were analyzed by use of t tests
and can be summarized as follows. 

The SAT scores for firstborn females were 
significantly higher on both the verbal and mathe- 
matical sections than the scores for later-born 
females. There was some tendency (not sta- 
tistically significant) for the firstborn males to 
have higher verbal ability than later-born males. 
The male SAT mathematical ability means were 
almost identical (531.49 versus 532.08) which 
agrees well with the Altus data in which they 
were identical. There were no significant results 
or trends in the R-S or M-C data. That the first- 
borns were overrepresented in these data can be 
seen in the fact that of the 142 subjects, 58% 
were in the firstborn category (51% if the only 
child is excluded). The data reported are in
perfect agreement with the original data pre- 
sented by Altus with the exception that, whereas 
the female SAT math scores showed only a 
trend toward being higher for firstborns in his 
study, they were found to be significantly higher 
in the present data. The present data may thus 
be considered a confirmation of the Altus study. 
RELATIONSHIP BETWEEN MARITAL STATUS AND THE
PERSONALITY OF MOTHERS OF DISTURBED CHILDREN¹


GEORGE H. DUNTEMAN AND WILLIAM D. WOLKING
Medical Center, University of Florida 


A recent study by Loeb and Price (1966) 
found that divorced and separated mothers of 
children brought to a child guidance clinic scored 
significantly higher than married mothers on the 
Pd, F, Sc, and Ma scales of the MMPI. One 
purpose of this study was to determine whether 
the mean MMPI profiles of the divorced and 
separated mothers of emotionally disturbed chil- 
dren were significantly different from those of 
the married mothers of emotionally disturbed 
children. A second purpose was to examine the 
relationship between the marital status of the 
mothers and the incidence of child-behavior 
disorders. 

The MMPI was administered routinely to the 
parents of children brought to the Division of 
Child Psychiatry of a midwestern metropolitan 
medical center. From the total sample of 534 
mothers with MMPI profiles, the MMPI pro- 
files of the 44 divorced or permanently sepa- 
rated mothers were compared with the MMPI 
profiles of a random sample of 44 continuously 
married mothers. The analysis was based upon 
the L, F, K, and 10 basic clinical scales of 
the MMPI. The mean ages for the married 
mothers and for the separated and divorced 
mothers were 36.26 and 37.13, respectively. 
There was one Negro in each sample. A two-group
linear discriminant-function analysis was
conducted and was tested for significance by an
F approximation (cf. Rao, 1952). The t tests
for the 13 scales were computed to determine
if the present results would cross-validate the
findings of Loeb and Price.


¹ An extended report of this study may be obtained
without charge from George H. Dunteman,
College of Health Related Professions, University of
Florida, Gainesville, Florida 32601, or for a fee from
the American Documentation Institute. Order Document
No. 9204 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of Congress,
Washington, D. C. 20540. Remit in advance
$1.25 for microfilm, or $1.25 for photocopies, and
make checks payable to: Chief, Photoduplication
Service, Library of Congress.

This research was supported in part by Grant RD-1127
to the Regional Rehabilitation Research Institute,
University of Florida, from the Vocational
Rehabilitation Administration, Department of Health,
Education, and Welfare, Washington, D. C.

The F test indicated that the discriminant 
function was not significant at the .05 level. 
However, the t tests indicated that the Pd scale 
would significantly differentiate between these 
groups at the .01 level. The means for the mar- 
ried group and the separated and divorced group 
were 56.1 and 62.5, respectively. No other scales 
were significant at the .05 level. A chi-square 
test indicated that the frequency of behavior 
disorders between the two groups was not sig- 
nificantly different at the .05 level. The fre- 
quencies of behavior disorders in the two groups 
were compared because of Loeb and Price’s find- 
ings that the children from disrupted homes were 
more frequently rated as aggressive. 
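
As a rough illustration of the scale-by-scale comparison, a pooled two-sample t statistic can be computed from group means, standard deviations, and sample sizes. The sketch below uses the Pd means and group sizes reported above; the standard deviations of 10 T-score points are an assumption made purely for illustration (the report does not give them), so the printed value should not be read as the statistic obtained in the study.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic with a pooled variance estimate."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / se, df

# Means and group sizes are from the text; both SDs (10.0) are hypothetical.
t_value, df = pooled_t(56.1, 10.0, 44, 62.5, 10.0, 44)
print(f"t({df}) = {t_value:.2f}")
```

With other standard deviations the statistic changes proportionally, which is why the assumed values matter more than the arithmetic.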

Loeb and Price (1966) obtained their most 
significant difference between the divorced and 
separated mothers and married mothers of emo- 
tionally disturbed children on the Pd scale. 
However, they also found significant differences 
on the F, Ma, and Sc scales while the present 
study only found insignificant trends in these 
directions. The present study indicated a trend 
in which the children diagnosed as behaviorally 
disordered come from disrupted homes, but this 
finding was not statistically significant. These 
particular findings do not support those obtained 
by Loeb and Price. 
WISC OBJECT ASSEMBLY AND BODILY CONCERN 1


G. JAMES ROCKWELL, JR. 
University of Minnesota 


Blatt, Allison, and Baker (1965) studied the 
Wechsler Object Assembly (OA) subtest and 
bodily concern and remarked: 


The results of the study clearly support the hypoth- 
esis that performance on the OA subtest of the 
Wechsler intelligence scales is susceptible to inter- 
ference by concerns and preoccupations about body 
intactness. 


The section of the study in which children were 
employed as subjects is open to criticism on sev- 
eral grounds and the conclusions are at least in 
part unwarranted. 

In the first place, the sample contained a very 
small N, seven in the bodily concern and six in 
the control group. It would appear that before 
any broad generalizations are made, a larger N 
and a cross-validation are in order. A second 
criticism might be aimed at the selection procedure. 
Although the cases were selected by an 
independent clinician who did not know the hy- 
pothesis being tested, there was no mention of 
whether he made his selection randomly, or if he 
matched subjects on some criterion. It would be 
more appropriate to use groups matched at least 
on Full Scale IQ before conclusions are drawn 
since Blatt et al. (1965) remarked: 


The somewhat lower IQ of the bodily concern 
group, however, may have exaggerated the differ- 
ences on the OA subtest. 


1An extended report of this study may be ob- 
tained without charge from G. James Rockwell, 
Jr., 222 Child Development, University of Minne- 
sota, Minneapolis, Minn. 55455 or for a fee from 
the American Documentation Institute. Order Docu- 
ment No. 9206 from ADI Auxiliary Publications 
Project, Photoduplication Service, Library of Congress, 
Washington, D. C. 20540. Remit in advance 
$1.25 for microfilm or $1.25 for photocopies, and 
make checks payable to: Chief, Photoduplication 
Service, Library of Congress. 

The author is indebted to Britton K. Ruebush for 
editorial assistance. 




To test somewhat more rigorously the hypothesis that 
the OA subtest is influenced by bodily concern, 
a further study was undertaken using boys of normal 
or above-normal intelligence matched on IQ; its 
generality is accordingly restricted to males. 

The records of 30 boys of average or above- 
average intelligence who had taken the WISC as 
part of their diagnostic study were selected from 
the files of the Child Development Clinic at the 
University of Minnesota. The files of each boy 
who had been administered the WISC were read, 
and those 15 containing a statement that the boy 
evidenced bodily concern were placed in the 
experimental group. A sample of 15 subjects 
whose files contained no statement about bodily 
concern were matched with the bodily concern 
group on WISC Full Scale IQ and served as a 
control group. 

The groups did not differ on mean age, on OA, 
or on any other WISC subtests (all p’s >.1), 
nor were there differences when the deviation of 
OA subtest of each boy was compared to his 
total subtest mean. There were no significant 
differences on any of the individual items of the 
OA subtest (again all p’s >.1). In all analyses, 
t tests for dependent samples were used. 
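
Because the two groups were matched pairwise on Full Scale IQ, each comparison rests on the differences within pairs rather than on the two group means separately. A minimal sketch of a t test for dependent samples follows; the scores are hypothetical and are not data from the study, and the function name is chosen only for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(experimental, control):
    """t statistic for matched pairs: mean difference over its standard error."""
    if len(experimental) != len(control):
        raise ValueError("matched samples must have equal length")
    diffs = [e - c for e, c in zip(experimental, control)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1

# Hypothetical OA scaled scores for five matched pairs (not from the study).
bodily_concern = [9, 11, 10, 12, 8]
matched_control = [10, 10, 11, 11, 9]
t_value, df = paired_t(bodily_concern, matched_control)
print(f"t({df}) = {t_value:.2f}")
```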

The results differ from those of Blatt et al. 
(1965) in that they do not support the hypoth- 
esis that performance on the OA subtest of the 
WISC is susceptible to interference by concerns 
about bodily intactness. Care in interpretation 
of disruptions on OA as being due to body con- 
cern seems indicated. 
TIME ESTIMATION IN PROCESS-REACTIVE SCHIZOPHRENIA 1


CHERYL J. NORMINGTON 
Colorado State University 


Although temporal distortions are believed by 
many writers to be quite common in schizo- 
phrenia, as has been indicated in the literature 
review by Wallace and Rabin (1960), empirical 
studies of time perception in schizophrenia have 
given contradictory results. In these studies, 
schizophrenia has been treated as an entity. 
Schizophrenics, as a group, have not been found 
to be homogeneous, however, and the variation 
noted has led to the development of a process- 
reactive classification. 

In the present research, the process-reactive 
classification was used in studying time estima- 
tion in schizophrenia. Process and reactive schizo- 
phrenics, and reactive schizophrenics and normals 
were compared as to their responses on a per- 
ceptual time-estimation task. Null hypotheses 
being tested were that groups would not differ in 
variability or accuracy of response. 

Schizophrenic (45) and normal (15) subjects 
were selected from the male patient population 
and hospital employees, respectively, of a Veterans 
Administration hospital. The pool of patients 
from which the schizophrenics were drawn 
consisted of patients who had appeared before 
the hospital diagnostic staffs within the past 18 
months and been diagnosed as schizophrenic. 
Clinical histories of these patients were screened; 
those with a history of brain damage or evidence 
of mental deficiency were excluded. 

Schizophrenics selected for the study were 
classified by means of the Abbreviated Becker 
Elgin Scale (ABES). The 15 having the highest 
ABES scores were assigned to the process group; 
the 15 with the lowest scores composed the reactive 
group. Two-rater reliability was established 
as being .94 (rank-order correlation). 


1 An extended report of this study may be obtained 
without charge from Cheryl Normington, 
payable to: Chief, Photoduplication Service, Library 

Comparisons by means of the chi-square test 
showed no significant differences between nor- 
mals, process schizophrenics, and reactive schizo- 
phrenics as to race or IQ score (as measured by 
an individually administered “screening” test of 
intelligence). Nor were significant differences 
found between process and reactive schizophren- 
ics for the additional variables of type of ward 
(open or closed) or subtype of schizophrenia. 

Subjects were individually administered a time- 
estimation task consisting of seven stimulus 
cards which were tachistoscopically presented, 
with exposure times of 10, 20, and 30 seconds 
for each card. Each subject viewed 21 presenta- 
tions with random card order and presentation 
time. Judgments of exposure times were con- 
verted into scores based upon ratios of estimated 
time to actual time. Scores (three) for each card 
were totaled. Groups were then compared on each 
of the stimulus cards by means of two-tailed t 
tests. 
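
The scoring rule described above (one ratio of estimated to actual time per presentation, with the three ratios for each card totaled) can be sketched as follows. The data layout and function name are assumptions made for illustration, and the judgments are hypothetical; the report does not state how over- and underestimates were treated when accuracy was compared.

```python
from collections import defaultdict

def card_scores(trials):
    """Sum estimated/actual ratios per card.

    trials: iterable of (card_id, actual_seconds, estimated_seconds).
    """
    totals = defaultdict(float)
    for card_id, actual, estimated in trials:
        totals[card_id] += estimated / actual
    return dict(totals)

# Hypothetical judgments for one card at the three exposure times.
trials = [(1, 10, 12), (1, 20, 20), (1, 30, 24)]
scores = card_scores(trials)
print(scores)
```

Under this rule a subject whose three estimates were exactly correct would total 3.0 on a card, so departures from 3.0 index the direction of error.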

Process schizophrenics portrayed significantly 
greater variability than did reactives on five of 
the seven cards, while reactives showed signifi- 
cantly more variability than normals on four of 
the seven cards and significantly less variability 
than normals on one of the cards. 

In accuracy of estimation, process schizophren- 
ics demonstrated significantly less accuracy than 
did reactives on four of the seven cards; no dif- 
ferences were shown between reactives and nor- 
mals in accuracy of estimation. 

Findings suggest that the use of a classification 
system for obtaining more homogeneous group- 
ings of schizophrenics is needed in studies of 
time estimation. The occurrence of contradictory 
findings might well be lessened in this way. 
RELATIONSHIP BETWEEN REPORTED AND OBSERVED 
DOMINANCE AND CONFLICT AMONG PARENTS OF 
SCHIZOPHRENICS 1


DOMENIC V. CICCHETTI 


Veterans Administration Hospital, West Haven, 
Connecticut 


The results of well-controlled studies indicate 
no significant differences among the mothers of 
schizophrenics and those of controls in the 
amount of reported dominance. However, several 
equally well-controlled studies demonstrate that 
the parents of schizophrenics behave in a more 
conflictual manner toward each other than is 
true of the parents of controls. 

The present study was designed to illuminate 
this discrepancy by obtaining attitudinal and be- 
havioral measures of both parental dominance 
and conflict. 

The subjects for this study were 35 sets of 
Caucasian parents: 11 were parents of hospital- 
ized poor premorbid schizophrenics (Poors) 
(Phillips, 1953), 12 were parents of hospitalized 
good premorbid schizophrenics (Goods), and 12 
were parents of hospitalized tubercular patients 
(controls). The groups did not differ in age or 
educational level. The parents were asked indi- 
vidually whether they agreed or disagreed with 
the dominance and conflict items from the Parental 
Attitude Research Instrument (PARI; 
Schaefer & Bell, 1955). They were then asked 
to resolve, individually then jointly, 12 hypo- 
thetical child-rearing problems. The parental dia- 
logues were all tape-recorded and later scored 
for dominance and conflict. The dominance in- 
dexes were the frequency of: speaking first, 
speaking last, passive acceptance of the spouse’s 
solution, and yielding. The conflict indexes were 
the frequency with which the parents: spoke 
simultaneously, interrupted each other, disagreed 
or aggressed against each other, failed to agree, 
and total amount spoken. 


1 An extended report of this study may be obtained 
without charge from Domenic V. Cicchetti, 
Psychology Service, West Haven VA Hospital, West 
Haven, Connecticut, or for a fee from the American 
Documentation Institute. Order Document No. 9277 

Consistent with the research literature, there 
were no reliable differences in reported dominance 
patterns among the three groups. Moreover, 
reported dominance was not significantly 
related to dominant behavior, either by groups or 
by parents within groups. 

However, the mothers of Poors reported more 
conflict than the mothers of controls (p < .05), 
and the latter reported less conflict than their 
spouses (p<.05). In addition, the correlation 
between reported and observed conflict ap- 
proached significance for the mothers of Poors 
(r= .54; p <.10) and was highly significant for 
the fathers of controls (r=.85; p< .005). 
Within parental groups, the mothers of Poors 
tended to be more reliable informants of conflict 
(r =.54) than their spouses (r= .31; p= .08), 
whereas the fathers of controls were much better 
informants (r=.85) than their wives (r= .60; 
p <.005). Further comparisons revealed that the 
fathers of controls gave more reliable estimates 
of conflictual behavior (r = .85) than did either 
the fathers of Goods (r= .37; p < .05) or the 
fathers of Poors (r=.31; p=.07). 

The results suggest that parents in general are 
not reliable judges of dominant behavior, but 
appear to be better estimators of conflictual be- 
havior. The latter finding is especially marked 
for the parents of controls. 
The present study considers a problem arising 
when a measure of patient process derived from 
Rogers’ process equation, the Experiencing 
(EXP) scale, was applied to recordings of a 
large number of individual therapy sessions 
with hospitalized schizophrenic patients (Rogers, 
Gendlin, Kiesler, & Truax, 1967). When a meas- 
ure of amount of silence was obtained from an 
early therapy interview for each of the therapy 
cases, a moderately negative, but statistically 
insignificant, relationship was found between this 
amount of silence and the level of EXP over the 
whole of therapy. This suggested that the more 
verbal patients (or more verbal interactions) 
received higher EXP ratings, and that the EXP 
scale might be more parsimoniously explained as 
a measure primarily of verbal productivity 
rather than depth of self-exploration. 

The present study looked at this problem in 
detail by comparing the EXP ratings for seg- 
ments of therapy interviews from normal, neu- 
rotic, and schizophrenic patients available from 
a previous study (Kiesler, Klein, & Mathieu, 
1965) with Interaction-Chronograph measures 
(Saslow, Matarazzo & Guze, 1955) of patient 
and therapist speech patterns obtained from the 
same segments. 

The basic data consisted of recordings of 1 
individual therapy hour from each of 24 cases, 
including 8 hospitalized schizophrenics, 8 neu- 
rotics from the University of Chicago Counseling 
Center, and 8 normal subjects (volunteers from 
a Wisconsin Grange organization). The inter- 
views were sampled by extracting 8-minute seg- 
ments of 5 nearly consecutive portions of each 
of the 24 recorded interviews, yielding a total of 
120 8-minute segments. 

The 120 8-minute segments were first pre- 
sented in a random order to four judges for EXP 
ratings. Ebel intraclass reliability of judges' means 
(r_kk) was .85. 
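
For readers unfamiliar with the index, an intraclass reliability of the mean of k judges' ratings can be obtained from the mean squares of a segments-by-judges analysis of variance. The sketch below is one common form of that computation, not necessarily the exact variant used here: it assumes that between-judge variance is removed from the error term, and the function name and ratings are hypothetical.

```python
def reliability_of_mean_ratings(ratings):
    """Intraclass reliability of the mean of k judges' ratings.

    ratings: list of rows, one per rated segment, each holding k judge scores.
    Between-judge variance is excluded from the error term (an assumption).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows

# Hypothetical ratings: three segments, two judges.
example = [[1, 1], [2, 3], [3, 2]]
print(round(reliability_of_mean_ratings(example), 3))
```

Judges who differ only by a constant amount on every segment yield a value of 1 under this form, since their disagreement is treated as a judge effect rather than as error.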

Interaction-Chronograph (IC) measures of 
patient and therapist verbalization rates and 
verbal interactional variables were obtained for 
the same 120 8-minute segments by means of a 
modified Saslow and Matarazzo IC procedure. 

The following results emerged: (a) Pearson 
correlations between EXP ratings and the IC 
variables revealed little evidence that EXP rat- 
ings are systematically influenced or biased by 
patient or therapist formal speech patterns; (b) 
when the covariance of each IC variable was 
partialled out, the EXP differences obtained for 
the original unadjusted EXP scores were not sig- 
nificantly altered. 

It seems safe to conclude that for the sample 
used in this study, differences in self-explora- 
tory verbalization as measured by the EXP scale 
are independent of the formal communication 
factors tapped by the Saslow and Matarazzo IC 
measures. 

This finding is consistent with the conceptuali- 
zation of the EXP scale as a measure of pa- 
tient-verbalization quality, independent of fac- 
tors such as the length and specific content of 
the therapy interaction. 
FURTHER INVESTIGATION OF THE EFFECTS OF SUBLIMINAL 
AGGRESSIVE STIMULATION ON THE EGO FUNCTIONING 
OF SCHIZOPHRENICS 


LLOYD H. SILVERMAN 1 AND ROBERT H. SPIRO 
Manhattan Veterans Administration Hospital 


This was a study of the effects of subliminally presented aggressive stimuli on 
the ego functioning of schizophrenics. 40 hospitalized male Ss were seen for an 
experimental and control session in a balanced design. Measures of pathological 
thinking, accuracy of recall, and projection of aggression were obtained after 
the subliminal presentation of aggression-related and neutral stimuli. In re- 
sponse to the experimental condition both paranoid and nonparanoid patients 
produced significantly more pathological thinking; only the paranoids reacted 
with a significant increase in projection of aggression, and only the nonparanoids 
manifested a significant impairment in accuracy of recall. These data 
were seen as offering further support for the view that the disturbing effects 
of drive stimulation can be studied through the subliminal presentation of 
drive-related stimuli. 


In a series of recent papers (Silverman, 
1965b, 1966; Silverman & Goldweber, in 
press; Silverman & Silverman, 1964; Silver- 
man & Spiro, 1966), an experimental method 
has been described for studying the effects 
that the activation of drive derivatives has on 
ego functioning. Drive-related and neutral 
pictorial stimuli have been presented tachistoscopically 
at a subliminal level, and the reactions 
to each have been sought immediately 
afterward. The overall finding has been that 
after the drive stimuli various kinds of 
pathological reactions and defensive processes 
appeared which were not in evidence after 
the neutral pictures. It has been reasoned 
that the occurrence of these phenomena was 
enhanced by, if not dependent on, the presentation 
of the drive stimuli in subliminal form. 
Data from two recent experiments (Silverman 
& Goldweber, in press; Silverman & Spiro, 
1966) support this contention. The sudden 
press of drive derivatives that are triggered 
by such stimuli cannot be attributed to an 
external source, as they would be if the 
pictures were shown supraliminally. Thus, 
direct discharge of these derivatives is more 
apt to be blocked, a condition which, as has 
been pointed out elsewhere (Silverman, 
1965a), increases the likelihood of a pathological 
outcome. 


1 Also at New York University. The authors wish 
in the tachistoscopic procedures. 

In one of the earlier studies (Silverman, 
1966), the effects that aggressive stimuli 
had on the thinking of schizophrenics as 
revealed in a Rorschach task were examined. 
The main dependent variable under consider- 
ation was the amount of pathological thinking 
manifested, that is, thinking that is illogical, 
unrealistic, and loose, which in psychoanalysis 
is conceptualized as falling under the domi- 
nation of the primary process. Each of 32 hos- 
pitalized patients was seen on separate days 
for an experimental and control session. First, 
a “baseline” measure of the schizophrenic’s 
propensity for this kind of thinking was 
obtained much as it would be in a psychodiagnostic 
situation. Then after subliminal 
exposure of an aggressive stimulus on one 
occasion and a neutral stimulus on the other, 
a second or “critical” measure was obtained. 
In line with what had been predicted, patho- 
logical thinking was found to increase sig- 
nificantly under the aggressive condition. This 
finding was seen as consistent with theoretical 
formulations that have been offered by a 
number of writers to the effect that much 
of the ego disturbance in schizophrenia is 
the result of an inability to successfully cope 
with aggressive impulses (Bak, 1954; Cohen, 
1954; Hartmann, 1953; Pious, 1949).2 

2 It should be noted, however, that in two studies 
(Silverman, 1965b; Silverman & Goldweber, 1966) 
One purpose of the current investigation 
was to attempt a replication of both the gen- 
eral finding that when a drive-related stimu- 
lus registers subliminally it can alter ego 
functioning and the specific finding that 
pathological thinking in schizophrenia inten- 
sifies in reaction to aggressive stimulation. 
Another purpose was to determine if patho- 
logical thinking or other kinds of ego dys- 
function would become manifest in tasks 
more structured and hence less pathology- 
inducing than the Rorschach, namely, in a 
word-association test and a test of immediate 
memory. In both of these, unlike the Ror- 
schach, the test stimuli are unambiguous, and 
in the memory test the type of response re- 
quired of the subject (S) is highly struc- 
tured as well. Thus, these tests provide an 
opportunity for determining if the experi- 
mental manipulation used in the Rorschach 
study would still prove to be disruptive in 
less stressful circumstances. 

A final aim of this study was to investigate 
differences in the reactions of paranoid and 
nonparanoid schizophrenics to the same ag- 
gressive stimulation. A number of writers 
have cited evidence which highlights the im- 
portance of separating schizophrenics along 
this dimension when carrying out research 
(Johannsen, Friedman, Leitschuh, & Ammons, 
1963; Shakow, 1962; Silverman, 1964). 
Moreover, with regard to the earlier Ror- 
schach experiment, a recently reported 
(Silverman, in press) post hoc breakdown of 
the Ss into paranoid and nonparanoid groups 
clearly revealed the relevance of this dimen- 
sion for the present work. All 17 of the 
schizophrenics who carried chart diagnoses 
of a nonparanoid subtype manifested note- 
worthy disruption after the experimental ma- 
nipulation, 13 of them manifesting a notable 
increase in pathological thinking, the other 
4 showing other kinds of disorganization. On 
the other hand, only 9 of the 15 with clinical 


it has been demonstrated that it is not only schizo- 
phrenics who react to subliminal aggressive stimula- 
tion with pathological thinking. Certain kinds of 
nonschizophrenics, namely those with a relatively 
impaired ability to neutralize aggression, also can 
respond in this way. However, in contrast to the 
schizophrenics, such a reaction in these nonschizo- 
phrenics has been found to be contingent on a prior 
experimental arousal of blatantly aggressive ideas. 
diagnoses of paranoid schizophrenia manifested 
disorganization of any kind. However, 
each of the remaining six paranoids responded 
with either an increase in guardedness or the 
attribution of aggressively threatening quali- 
ties to the Rorschach images which were not 
in evidence after the neutral stimulus. That is 
to say, the paranoid schizophrenics who did 
not manifest increased disorganization after 
the aggressive stimulation did show an in- 
crease in paranoid manifestations. This find- 
ing not only highlighted the importance of 
obtaining information regarding subtype di- 
agnosis, but also argued for including among 
the dependent variables one that would reflect 
increases in paranoid expression after the 
experimental manipulation. With this in 
mind, the following experiment was designed. 


METHOD 
Subjects 


There were 40 male Ss, all of whom were patients 
at the Northport Veterans Administration Hospital. 
Each carried a hospital diagnosis of schizophrenia 
without organic involvement. On the basis of psychi- 
atric evaluation, 19 of these patients were sub- 
classified as paranoid schizophrenics and 21 as non- 
paranoid schizophrenics. The Ss ranged in age from 
29 to 52 years, with a median age of 43 years. 
They had been hospitalized for 1 to 25 
years, the median being 16 years. Their median educational 
level was 8 years, and they ranged in this 
respect from 5 to 14 years. 


Stimuli and Tachistoscope 


As in the earlier experiment with schizophrenics 
(Silverman, 1966), the Ss were randomly divided 
into two groups, one of which received aggressive 
and neutral pictures of humans and the other, pic- 
tures of animals. For the first group, the aggressive 
stimulus was of a menacing looking man with a 
dagger in his upraised hand, and the neutral stimulus 
was of a man reading a newspaper. For the other 
group, the aggressive picture was of a lion with 
teeth bared, charging, and the neutral picture was 
a bird with wings spread, alighting. These stimuli 
were shown through an electronically controlled 
mirror tachistoscope, the S looking through an eye- 
piece at a blank field, with the stimuli exposed from 
a second field. The viewing distance was 49 inches, 
and the surface brightness of a white card for the 
intensity setting used was 32 footlamberts. 


Procedure 


Each S served as his own control and was seen 
for two sessions. In each session there was a baseline 
and critical assessment made of the S’s functioning, 
with the change from the former to the latter reflecting 
the effect of the particular tachistoscopic 
stimulation (aggressive or neutral) for that particular 
day. The following is a description of the three 
tasks which yielded the dependent variables. 

Story recall. This consisted of two taped passages 
for each baseline and critical period, the immediate 
recall of which was requested from the S. Each of 
the passages (eight altogether) contained a simple 
and brief narrative account of some more or less 
bland event, the stories having been made up for 
use in this experiment by the authors. 

Word-Association test. Four of the forms de- 
veloped by Moran (1959) were used, one for each 
of the baseline and critical series. Each form con- 
sists of 20 words to which the subject was asked 
to respond in usual word-association fashion. 

Faces test. This was innovated by the authors to 
assess projection of aggression. The S was given the 
following instructions: 


I am going to show you eight pictures of 
foreigners, that is, not persons like you and me, 
but men who live in foreign countries. [Calling 
them foreigners was intended to foster projec- 
tion.] As you know some foreigners are pleasant, 
but some are unpleasant. I would like you to 
examine each picture carefully and judge how 
this particular man appears to you. Use this seven 
point scale I am putting before you and place 
each person somewhere on the scale from “most 
pleasant” to “most unpleasant.” 


The pictures in each baseline and critical series con- 
sisted of an interspersing of cutouts of rather bland 
looking businessmen from the New York Times and 
faces of much more expressive looking figures 
selected from among the Szondi (1947) pictures. 

The second author administered the procedure to 
all the Ss. In the first session, the S was introduced 
to the experiment in the following way: 


I am doing psychological research here in the 
hospital and am trying to learn as much as pos- 
sible about the patients who are here. Your name 
was given to me by the nurse in your building 
as someone who is highly cooperative and willing 
to help. Thus, I am going to ask you to engage 
in some tasks that will give me some of the 
information that I am looking for. The first 
thing I am going to ask you to do is look 
through the eye-piece of the machine next to you 
and you will see some flashes of light. Please put 
your eyes against the eye-piece. I will say “ready 
get set” and then press a button. Then you tell 
me what you have seen such as “a flash of light,” 
or anything else that appears. 


The S then was given four exposures of a neutral 
picture at 15-second intervals. Each exposure was 
for four milliseconds and was preceded by the words, 
“ready get set.” This neutral picture was different 
from the control stimulus and was either of a 
butterfly, shown to those Ss who were later to get 
animal pictures, or a serious looking man with arms 
TABLE 1 
SUMMARY OF PROCEDURE 

 1. 4 flashes of neutral stimulus. 
 2. Baseline story-recall test (2 passages). 
 3. Baseline faces test (8 pictures). 
 4. 4 refresher flashes of same neutral stimulus. 
 5. Baseline word-association test (20 words). 
 6. 4 flashes of aggressive or control stimulus. 
 7. Critical story-recall test (2 passages). 
 8. Critical faces test (8 pictures). 
 9. 4 refresher flashes of same aggressive or control stimulus. 
10. Critical word-association test (20 words). 

at sides, shown to those Ss who were later to get 
pictures of humans. These “baseline flashes” were 
included so that the tasks that were to follow would 
be under the same general conditions as those that 
were to come in the critical series. Then the S was 
instructed as follows: 


Now I am going to play a short passage on this 
tape recorder. I would like you to listen carefully 
and when it is over, give back as much of the 
passage as you can remember. 


A passage was played and the S’s recall was recorded 
verbatim. Then a second passage was played, and 
the recall was obtained on that. This was followed 
by the faces test and the word-association test, 
which completed the baseline series. 

The S was asked to look into the tachistoscope 
again, and the same procedure was repeated as had 
taken place earlier, except that a different stimulus 
was shown. For half of the Ss, this was one of 
the aggressive stimuli, and for the other half it 
was one of the control stimuli. The experimenter 
then administered to the S two more passages for 
recall, another series of eight faces for pleasant- 
unpleasant judgments, and another form of the 
word-association test. 

The procedure is summarized in Table 1. It 
can be noted that S was exposed to additional 
tachistoscopic exposures before the word-association 
test of the critical series. The intent was to revive 
the effect of the critical stimulus that had been 
flashed on at the beginning of the critical series. It 
can be further noted that there were similar revival 
flashes of the baseline picture before the word- 
association test of the baseline series. This was 
intended to make the baseline and critical series as 
similar in procedure as possible. 

Session II took place at least 2 days later with 
a procedure similar to that of Session I. A second 
baseline and critical series of tests was administered. 
The baseline series was preceded by the tachisto- 
scopic presentation of another neutral stimulus, 
two dogs for the Ss shown animals and another 
serious looking man for the Ss shown humans. 
Similarly, the second critical series was preceded 
by either an aggressive stimulus or a control stimulus, 
whichever was not shown in Session I. Again, 
there were refresher flashes before the word- 
association test of the baseline and critical series. 

Half of the Ss receiving the human stimuli 
and half of those receiving the animal stimuli were 
shown the aggressive picture before the critical series 
of the first session and the control picture before 
the critical series of the second. For the other half 
the sequence was reversed. For each session both 
stimuli that were to be flashed on were inserted 
into the tachistoscope beforehand by an assistant, 
according to a code list of which only she had 
knowledge. The experimenter who administered the 
various tests and worked the tachistoscope was 
“blind” with regard to whether a session was 
experimental or control. 
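
The balanced assignment described above can be sketched as a short routine of the kind an assistant might use to draw up the code list. It is a minimal illustration only: the function name, the seed, and the dictionary layout are assumptions, and the authors do not report how their random division was actually carried out.

```python
import random

def assign_conditions(subject_ids, seed=0):
    """Randomly split subjects by stimulus type, then balance session order."""
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    plan = {}
    for stimulus_type, group in (("human", ids[:half]), ("animal", ids[half:])):
        for i, sid in enumerate(group):
            # Alternate which condition precedes the critical series of Session I.
            order = ("aggressive", "control") if i % 2 == 0 else ("control", "aggressive")
            plan[sid] = {"stimuli": stimulus_type, "session_order": order}
    return plan

# Hypothetical subject numbers 1 to 40.
plan = assign_conditions(range(1, 41))
print(plan[1])
```

Keeping the resulting plan with the assistant alone is what allows the experimenter to remain blind to the condition of any given session.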

The exposure time for all tachistoscopic presentations 
was 4 milliseconds. In the previous work with 
the same hospital population (Silverman, 1966) the 
Ss not only were unable to recognize any aspect 
of either picture at this exposure, but also could 
not consciously discriminate any difference between 
the aggressive and control stimuli in a “discrimina- 
tion task” administered at the end of the experi- 
ment. To check on the “subliminality” of the 
exposure for the current sample, each of the 40 Ss 
was given the same task at the end of this experi- 
ment. In this task, which is described in detail 
elsewhere (Silverman, 1966, p. 107), the S was shown 
his experimental and control stimulus for 10 trials 
in random order under the same tachistoscopic 
conditions as existed during the experiment proper 
and he was asked simply to tell them apart. 


Scoring of Responses 


Four scores for each baseline and critical series 
were obtained, two from the story-recall task and 
one each from the word-association and faces tests. 
Story-Recall accuracy score. This consisted of 
the number of words (exclusive of prepositions and articles) that 
were correctly recalled from the two stories of the 
particular series. 

Story-Recall pathology score. This score was the 
sum of ratings given to the two stories in the series 
for the amount of pathological thinking that ap- 
peared (irrespective of the amount of material 
accurately recalled). A 10-point scale was used for 
these ratings. “Pathological thinking” referred both 
to the intrusion of material which was not in the 
original story and the organization of the recall, 
that is, how confused or otherwise poorly organized 
it was. In giving this rating, the total length of 
recall was taken into account so that for stories 
containing equal amounts of pathological thinking, 
the shorter the total recall, the higher the pathology 
rating. 

Word-Association pathology score. This consisted 
of the total number of deviant responses given. In 
this instance, pathological thinking referred to loose 
and otherwise distant associations, clang associations, 
repetitions of the stimulus word with no further 
verbalization, the failure to give a response, blocking 
before verbalizing, and multiword responses. 

Faces test projection of aggression score. This 
score consisted of the sum of the S’s ratings of the 
eight pictures in each series for degree of un- 
pleasantness.’ 

Since the two pathology scores were based on 
rater judgments, their reliability was sought by 
having a second scorer blindly rate the protocols for 
half of the Ss. The reliability coefficients yielded 
were .73 for story-recall pathology and .95 for word- 


®The authors will provide on request copies of 
the three tests, detailed criteria for rating pathology 
on the story-recall and word-association tests, and 
copies of the tachistoscopic stimuli. 


TABLE 2 
MEAN CHANGE SCORES FOR EXPERIMENTAL AND CONTROL SESSIONS FOR TOTAL SCHIZOPHRENIC GROUP 

                                         Experimental   Control    Difference 
                                         session        session    between 
                                         change         change     change 
Variable                                 score          score      scores       SDdiff   t      p a 
Word-association pathological thinking   2.63           .98        1.65         4.76     2.19   .025 
Story-recall pathological thinking       1.93           .38        1.55         3.71     2.66   .01 
Total pathological thinking              1.11           .35        .76          1.51     3.18   .002 
Story-recall accuracy                    -10            15         .65          10.20    .40    ns 
Faces test projection of aggression      1.03           -6         1.15         6.27     1.16   ns 

Note.—N = 40. 
a p values here and elsewhere are for a one-tailed test unless otherwise indicated. 
association pathology, both being within the acceptable 
range for scorer reliability. 

Treating each of the four scores separately, the 
score for each baseline series was subtracted from 
the score for the corresponding critical series, pro- 
ducing two “change scores,” one for the experimental 
and one for the control session. Since the major 
expectation of change was for pathological think- 
ing, the variable that showed the effect of the 
experimental manipulation in the previous study 
with schizophrenics (Silverman, 1966), an “overall 
pathological thinking” change score, also was com- 
puted for each session, after first converting each 
of the two distributions of pathology scores into 
standard scores. The hypothesis was that the various 
change scores would be greater for the experimental 
than the control session. 
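
The comparison just described amounts to a t test on paired differences. The following is a minimal sketch of that arithmetic, not the authors' own computation. It assumes that the SD column of Table 2 is the standard deviation of the per-S differences between the experimental and control change scores, and it uses only the summary values read from Table 2 (N = 40). The function name `paired_t` is introduced here for illustration.

```python
import math


def paired_t(mean_diff, sd_diff, n):
    """t statistic for a mean paired difference: mean / (SD / sqrt(n))."""
    return mean_diff / (sd_diff / math.sqrt(n))


# (mean difference between change scores, SD of the differences), from Table 2
table2 = {
    "word-association pathological thinking": (1.65, 4.76),
    "story-recall pathological thinking": (1.55, 3.71),
    "total pathological thinking": (0.76, 1.51),
}

N = 40  # total schizophrenic group
t_values = {name: paired_t(m, sd, N) for name, (m, sd) in table2.items()}

for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

Because the inputs are rounded to two decimals, a recomputed t can differ from the printed one in the second decimal place.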


RESULTS AND DISCUSSION 


Table 2 contains the mean change scores 
for the experimental and control session for 
each of the dependent variables. For patho- 
logical thinking, both the separate scores and 
the combined scores were significantly higher 
under the experimental condition. For story- 
recall accuracy and projection of aggression, 
the results although in the predicted direction, 
were not significant. Thus, the hypothesis was 
confirmed for the pathological-thinking mea- 
sures only. It can be added parenthetically 
that as was the case in the previous experiment 
(Silverman, 1966), both the animal and the 
human aggressive stimuli produced more 
pathological thinking than their controls, and 
there was no difference in the effectiveness of 
the two kinds of experimental stimuli (t < 1). 

Before discussing the above findings and 
presenting the data on the differential re- 
actions of the paranoid and nonparanoid 
schizophrenics, let us look first at the results 
of the discrimination task which bear on the 
question of whether the presentation of the 
aggressive stimuli in the experiment can be 
considered subliminal. If one considers either 
8 or more correct responses or 8 or more 
incorrect responses out of the 10 trials as a 
nonchance performance (p < .10, two-tailed 
test), only three Ss met this criterion. Two 
were correct in their judgments eight times 
and incorrect two times. The other was cor- 
rect twice and incorrect eight times. Since 40 
Ss were seen, these three findings can be 
attributed to sampling error. Moreover, all 
three of these Ss were carefully questioned. 
They gave no indication of having seen any 




content of either stimulus. They maintained 
that they attempted to discriminate on the 
basis of whatever slight differences in the 
intensity or coloring of the flashes they could 
detect. It also should be noted that when the 
40 Ss were taken as a unit there was virtually 
no difference between the number of correct 
and incorrect discriminations made (202 to 
198), and the ratios of “hits” to “misses” 
were normally distributed. For 14 Ss, there 
were five correct and five incorrect discrimi- 
nations; for 9 Ss, six and four; for 10 Ss, 
four and six; for 2 Ss, seven and three; for 
2 Ss, three and seven; for 2 Ss, eight and 
two; and for 1 S, two and eight. In light of 
these findings it can be maintained that, as 
in our earlier experiments, impairment in 
ego functioning was brought about by a 
drive-related stimulus that registered at a 
subliminal level. 
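
The chance criterion used above can be checked with a short binomial calculation. This is a sketch under the assumption that each of the 10 discrimination trials is an independent guess with probability .5 of being correct; the variable names are illustrative only. It computes the probability of 8 or more correct or 8 or more incorrect responses, the number of Ss expected to reach that criterion by chance in a sample of 40, and the totals implied by the reported distribution of hits and misses.

```python
from math import comb

N_TRIALS = 10
N_SS = 40


def prob_extreme(k_min, n=N_TRIALS):
    """P(at least k_min correct OR at least k_min incorrect) under guessing.

    Valid when 2 * k_min > n, so that the two tails do not overlap.
    """
    upper_tail = sum(comb(n, k) for k in range(k_min, n + 1))
    return 2 * upper_tail / 2 ** n


p_extreme = prob_extreme(8)
expected_ss = N_SS * p_extreme  # Ss expected to meet the criterion by chance

# Reported distribution: number correct out of 10 -> number of Ss
reported = {5: 14, 6: 9, 4: 10, 7: 2, 3: 2, 8: 2, 2: 1}
total_ss = sum(reported.values())
total_correct = sum(k * n for k, n in reported.items())
total_incorrect = total_ss * N_TRIALS - total_correct

print(round(p_extreme, 3), round(expected_ss, 1))
print(total_ss, total_correct, total_incorrect)
```

Under these assumptions the exact two-tailed probability is close to the .10 level cited in the text, and about four Ss would be expected to meet the criterion by chance, so three observed cases are in line with sampling error.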

Let us now consider the specific effects of 
the experimental manipulation. Table 2 re- 
veals that the expectation that a subliminal 
aggressive stimulus would lead to an increase 
in ego disturbance has been supported as far 
as pathological thinking is concerned, the 
aspect of ego pathology that showed exacerba- 
tion in the earlier study (Silverman, 1966). 
It can be concluded that when schizo- 
phrenics are stimulated by an aggressive 
stimulus out of awareness, this type of think- 
ing can be observed in tasks of varying de- 
grees of structure. While it previously showed 
itself during the highly unstructured Ror- 
schach, in the current experiment pathological 
thinking appeared during both the somewhat 
more structured word-association test and the 
highly structured story-recall test. 

Let us turn next to the issue of the dif- 
ferent reactions of the paranoid and non- 
paranoid schizophrenics. After the data from 
the current experiment were collected, the 
work of Julian Silverman (1964) was noted. 
This investigator has pointed out that the 
paranoid-nonparanoid dimension often inter- 
acts with the chronic-acute dimension in 
determining the results yielded in various 
tasks and implies that when paranoids and 
nonparanoids are to be compared, whenever 
possible, the chronicity factor should be held 
constant. Therefore, a determination was 
made of the status of the current Ss with 
regard to length of hospitalization. Using 
Silverman’s (1964) criteria, 35 Ss could 
be classified as “chronic” (i.e., hospitalized 
for a total of 6 or more years) and only 
5 as “acute” (less than 3 years of hospitalization). 
None of the Ss fell in the area 
between these two extremes, which is considered 
the unclassifiable part of the range. 
There were too few acute patients to consider 
them separately, but a comparison could 
be made within the chronic group between 
17 who had been subclassified as nonparanoid 
schizophrenics and 18 subclassified as paranoid 
schizophrenics. t tests on the differences 
between experimental and control session 
change scores were carried out separately for 
each of these groups for the dependent 
variables described earlier. 

The pertinent data are reported in Table 3. 
For pathological thinking, the paranoids and 
nonparanoids differed little, both groups reacting 
to the subliminal aggressive stimulation 
with an increase in their total score. It 
can be noted, however, that the word- 
association score chiefly contributes to this for 
the paranoids, while for the nonparanoids the 
overall pathological-thinking effect is carried 
by the story-recall score. No explanation for 
this difference offers itself at present. 

On the other hand, for story-recall accu- 
racy and projection of aggression, the two 
measures that did not show an effect for the 
total sample, the two groups reacted in a 
strikingly different manner. The nonparanoids 
manifested a significant loss of accuracy on 
story recall, while the paranoids were not 
affected. Conversely, the paranoids manifested 
a significant increase in projection on the 
faces test, while the nonparanoids were 
unaffected on this measure. 

These last two results can be seen as con- 
sistent with clinical observations of differences 


TABLE 3 

MEAN CHANGE SCORES FOR EXPERIMENTAL AND CONTROL SESSIONS FOR THE CHRONIC PARANOID 
SCHIZOPHRENICS AND THE CHRONIC NONPARANOID SCHIZOPHRENICS 

                                         Experimental   Control    Difference 
                                         session        session    between 
                                         change         change     change 
Variable                                 score          score      scores       SDdiff   t      p 
Word-association pathological thinking 
  Paranoid                               3.39           .72        2.67         5.81     1.95   .05 
  Nonparanoid                            1.41           1.00       .41          3.81     .45    ns 
Story-recall pathological thinking 
  Paranoid                               1.89           .72        1.17         3.83     1.30   .12 
  Nonparanoid                            2.06           .41        1.65         3.95     1.72   .06 
Total pathological thinking 
  Paranoid                               1.15           .29        .86          1.57     2.32   .025 
  Nonparanoid                            1.08           .40        .68          1.67     1.68   .06 
Story-recall accuracy 
  Paranoid                               .89            0          .89          11.14    .34    ns 
  Nonparanoid                            -2.76          1.12       3.88         8.74     1.83   .05 
Faces test projection of aggression 
  Paranoid                               1.72           -1.22      2.94         6.35     1.96   .05 
  Nonparanoid                            .82            .65        .17          6.31     .12    ns 

Note.—Chronic paranoid schizophrenics, N = 18; chronic nonparanoid schizophrenics, N = 17. 
between paranoid and nonparanoid schizophrenics. 
With regard to accuracy of story 
recall, which can be taken as a measure of 
intellectual efficiency, it is almost a clinical 
commonplace that paranoid schizophrenics 
often show little if any general intellectual 
impairment, while most other kinds of schizo- 
phrenics usually manifest noteworthy impair- 
ment. The research of Mason (1956) and 
Roe and Shakow (1942) bear this out. How- 
ever, while this statement refers to the effect 
on intellectual functioning of the original 
schizophrenic “break,” the story-recall data 
bear on transitory shifts in functioning oc- 
curring well after the schizophrenic condition 
has been established. Our results indicate that 
as a consequence of aggressive stimulation, 
the already impaired functioning of the non- 
paranoids dips even lower, while for the 
generally well-retained paranoids, intellectual 
efficiency remains unchanged. 

The difference in accuracy of story recall 
that was found for the paranoid and non- 
paranoid schizophrenics can be related to 
one of the differences in cognitive style that 
Silverman (1964) has noted as distinguish- 
ing patients of these two subtypes. He re- 
ported that extensive scanning is generally 
found in paranoid and minimal scanning in 
nonparanoid schizophrenics. Extensive scan- 
ning would be expected to be of help in a 
recall task of the sort presented to the Ss. 
Could it have been this disposition that en- 
abled the paranoids to maintain their memory 
functioning, despite the fact that (as Table 3 
shows) in other respects the aggressive stimu- 
lation increased ego disturbance? 

A post hoc finding provides some support 
for this possibility. The experimenter had 
recorded the amount of time used by each S 
in recalling his baseline and critical stories. 
Thus change scores could be computed for 
this variable in the same way as they were 
for the variables listed in Table 3. A test of 
the difference between these change scores 
revealed that for the paranoids there was a 
significantly greater increase in time spent on 
story recall in the experimental session 
(t= 2.84; p=.001, two-tailed test). The 
nonparanoids showed no such increase after 
aggressive stimulation; in fact, they manifested 
a decrease in time spent, though this 
was not significant (t = 1.35). Considering 
Silverman’s (1964) conclusion that when 
threatened, “extensive scanners appear to scan 
to an even greater (than usual) degree 
[p. 361],” the performance of the paranoids 
on story recall can be understood in the 
following way. The threat of aggression trig- 
gered by the experimental manipulation led 
them to intensify their scanning behavior. 
This was reflected in the increased time they 
spent trying to remember the stories and en- 
abled them to maintain their usual level of 
recall accuracy. 

Finally, let us turn to the finding that the 
paranoids manifested a significant increase in 
their attribution of unpleasantness to the 
faces that they rated after aggressive stimula- 
tion. This result can be taken as an experi- 
mental demonstration of the paranoid schizo- 
phrenic’s propensity for handling upsurges of 
aggression through projection. The absence of 
such a finding for the nonparanoids would 
reflect their lesser tendency to project when 
aggressive drive derivatives are stirred up. 

In conclusion, the results of this experiment 
support the following contentions: (a) The 
disruptive effects that drive stimulation can 
have on ego functioning can be studied 
through the subliminal presentation of a 
drive-related stimulus; (b) the response of 
both paranoid and nonparanoid schizo- 
phrenics to aggressive stimulation is increased 
pathological thinking; (c) nonparanoid 
schizophrenics also react to this stimulation 
with a loss of intellectual efficiency and 
paranoid schizophrenics with an increased 
projection of aggression. 
OBESITY, LEVEL OF ASPIRATION, AND RORSCHACH 
AND TAT MEASURES OF ORAL DEPENDENCE 1 
Obese Israelis and controls were administered the Rorschach, TAT, and a 
level-of-aspiration task. The obese Ss scored higher on oral dependence than 
the control Ss on both the Rorschach (p=.01) and TAT (p= .02). There 
were no differences on either test for oral sadism, although the Rorschach 
subcategories of overwhelming figures and burdens and TAT themes of depri- 
vation discriminated significantly between the groups. The most sensitive oral 
dependence subcategories were nurturers, supplicants and food organs (Ror- 
schach), and themes of passivity, optimism, and helplessness (TAT). When 
only those Ss who scored above or below the median on both tests were considered, 
predictions regarding obesity were 90% accurate. Contrary to expectation, 
the obese Ss set level-of-aspiration goals more realistically than the 
controls. 


Obesity is a frequent problem in our 
culture. Dorfman (1946) estimated 20 years 
ago that there were 30,000,000 obese people 
in the United States and that they carried 
with them 125,000 tons of surplus weight. 
The language of eating and overeating has 
become part of everyday speech. Such expressions 
as “egg in your beer,” “eat, drink, 
and be merry,” “the way to a man’s heart is 
through his stomach,” and “an army travels 
on its stomach” and the rotund images of 
Falstaff and Santa Claus indicate the extent 
to which food and eating have influenced 
symbols and speech. 

As might be expected, psychoanalysis has 
had a great deal to say about obesity, a 
mouthful, in fact, to use an oral cliché. 
Kaplan and Kaplan (1957) have critically 
reviewed this literature. In general, the ana- 
lytic position is that food represents the 
mother’s love, and by overeating the obese 
person can indulge in the unconscious wish 
to experience the infant’s satisfactions in 
taking in food. By regressing to earlier satis- 
factions, adult frustrations are thereby 
avoided. Bruch (1961) has presented a some- 
what different version of the origins of 
obesity. When the mother is unable to relate 


1 This study was conducted while the first author 
held a Fulbright Fellowship at the Hebrew Univer- 
sity, Jerusalem. 


effectively toward the child, the child may 
develop “a falsified foundation for his per- 
ceiving his own bodily states [p. 473].” If 
the lack of bodily awareness is severe, the 
child may feel that “he neither owns his body 
nor is in control of its functions. Patients 
suffering from eating compulsions will say: ‘It 
just happens to me—I do not want to eat’ 
[p. 475].” 

Whatever the theoretical differences regard- 
ing the etiology of obesity, there is consider- 
able agreement about the personality charac- 
teristics of the obese, although there is no 
way of telling whether these characteristics 
precede or follow the onset of obesity. The 
obese are frequently described as dependent 
(Bruch, 1961; Schopbach & Matthews, 1945), 
immature (Bruch, 1961; Shovron & Richard- 
son, 1949), passive (Nicholson, 1946; Schop- 
bach & Matthews, 1945), and helpless 
(Bruch, 1961; Bruch, 1964b). These characteristics—dependence, 
immaturity, passivity, 
helplessness—constitute a good description of 
oral dependence, as defined by psychoanaly- 
sis. However, oral dependence is only one 
aspect of the oral stage. The other aspect, 
oral sadism, is characterized by biting, chew- 
ing, and ambivalence (Blum, 1953). These 
latter characteristics have never been used to 
describe the obese. On the basis of psychoanalytic 
theory, therefore, it would be expected 
that obese people should show oral- 
dependent signs, but not oral-sadistic signs. 

At least one other characteristic has been 
observed in the obese. Bruch (1964a) has 
commented on the preoccupation of the obese 
with unrealistic goals and unusual achieve- 
ment. If a child is seen as having to com- 
pensate for the failure and disappointments 
of his parents, as is frequently the case with 
the obese, he may develop exaggerated con- 
ceptions of his own abilities. Summers 
(1957), working with obese children in a 
level-of-aspiration task, has concluded that 
these children have difficulties in making real- 
istic estimates of their skills. 

In the present investigation, two general 
hypotheses were tested: (a) Obese subjects 
(Ss) would show more oral-dependent signs 
than control Ss, but would not show more 
oral sadism; and (b) obese Ss would show 
greater disturbances in goal setting, as meas- 
ured by a level-of-aspiration task, than would 
controls. A third goal of this study was to 
obtain data on the intertest consistency of 
the oral-dependent and oral-sadistic measures. 


METHOD 


The Ss were outpatients being treated at the 
Metabolic Clinic at the Hadassah Hospital, Jeru- 
salem. Control and obese Ss were seen by the same 
medical staff. Unfortunately, body weights are not 


TABLE 1 

DESCRIPTIONS OF EXPERIMENTAL AND CONTROL SUBJECTS 

                               Experimental   Control 
Variable                       group          group 
Sex 
  Female                       18             16 
  Male                         2              2 
Age 
  Age range                    15-64          16-68 
  Mean age                     39             41 
Education (no. of years) 
  0 or don’t know              5              4 
  1-8                          5              3 
  9-12                         8              8 
  12+                          2              3 
Birthplace 
  Israel                       6              6 
  Europe & North America       13             9 
  Moslem countries             1              3 
Mother's birthplace 
  Europe                       18             11 
  Israel                       1              2 
  Moslem countries             1              5 
available to describe either the experimental or control 
Ss, but the differences were marked and obvious 
even to the casual observer. Twenty Ss were being 
treated for extreme obesity; of these, 6 came to a 
special treatment center in an annex of the hospital 
where they ate all their meals, while the other 14 
were seen at intervals of 14 weeks, depending on 
the stage of treatment, at the Metabolic Clinic. 
The 18 control Ss were all seen at the Metabolic 
Clinic for a variety of medical reasons, but none 
was obese. The obese Ss were matched with the 
control Ss for age, sex, education, birthplace, and 
birthplace of the mothers. Table 1 presents the 
relevant data for the obese and control Ss. 


Tests 


Each S was tested individually by the second 
author, who was then a graduate student at the 
Hebrew University, either in Hebrew or English, 
depending on the preference of the S, with the 
instructions to him that this was part of the 
regular clinic routine. The following tests were ad- 
ministered, all in the same testing session, given in 
the order listed: Rorschach, four cards of the TAT 
(2, 7 GF, 13B, and 18 GF), and a level-of-aspira- 
tion task. Only the free association was requested 
in the Rorschach; no inquiry was attempted. 

The level of aspiration was assessed by showing 
each S eight 3 X 5 cards on which dots were drawn 
in the form of designs. The S was given 3 seconds 
in which to estimate the number of dots on each 
card. The test was introduced to the S as a measure 
of accuracy of the perception of the human eye. 
After the S had made his estimate of the number of 
dots on a card he was given fictitious information 
in percentiles regarding the accuracy of his per- 
formance, for example, “You did better than 75% 
of people in your age group.” Then he was asked 
two questions: how well he expected to perform on 
the next trial and how well he hoped to perform 
on the next trial. The following information was 
given S after his estimate on each card: 


After Card 1: “You did better than 75% of people 
in your age group.” 

After Card 2: “You did better than 35% of people 
in your age group.” 

After Card 3: “You did better than 40% of people 
in your age group.” 

After Card 4: “You did better than 30% of people 
in your age group.” 

After Card 5: S was told he did 20% better than 
his estimate after Card 4. 

After Card 6: S was told he did 15% worse than 
his estimate after Card 5. 

After Card 7: S was told he did 15% worse than 
his estimate after Card 6. 


Scoring Procedures 


The definitions of oral dependence and oral sadism 
were adapted from Schafer (1954). Simple scoring 
manuals were prepared for both the Rorschach and 
TAT by defining the categories and listing examples 
TABLE 2 
SUMMARY OF ATTEMPTS TO RELATE TEST SCORES TO OBESITY 

Predictor                                          Results         Direction 
Rorschach % (N Exp. = 20; N Con. = 16) 
  Oral dependence                                  χ² = 9.37***    Obese higher 
  Oral sadism 
Rorschach (absolute scores) 
  Oral dependence                                  χ² = 7.70***    Obese higher 
  Oral sadism 
TAT (absolute scores) (N Exp. = 20; N Con. = 15) 
  Oral dependence                                  χ² = 6.08**     Obese higher 
  Oral sadism 
Level of aspiration (N Exp. = 19; N Con. = 12) 
  D scores 
  Total number of shifts 
  Unusual shifts 
  Shift range 
  Raising estimate after success 
  Raising estimate after failure 
                                                   χ² = 6.64       Obese higher 
                                                   t = 2.43*       Controls higher 


under each subcategory. The following categories 
were used for each test. 

Rorschach. Oral dependence: food sources; food 
providers; passive food receivers; food organs; 
supplicants; nurturers; gifts, gift givers, and good 
luck symbols. Oral sadism: devourers; the act of 
fighting or killing; overwhelming figures; figures 
which deprive; deprivation; faulty oral capacity; 
oral assault; burdens. 

TAT. Oral dependence: passive dependent themes; 
asking for help or receiving help; presence of 
parental figures or nurturers; food sources; food 
organs; food providers or food objects; belief in 
good luck, magic, or optimistic story endings; help- 
lessness, loneliness, or depression; mouth behavior. 
Oral sadism: depriving others or being deprived; 
devouring figures and aggression; overwhelming 
figures; burdens; oral assault. 

For the Rorschach, each response which met one 
of these criteria was scored once; the maximum 
score could be no higher than any S’s total number 
of responses. The TAT was scored for each of the 
nine oral-dependent themes and five oral-sadistic 
themes outlined above; the maximum oral dependent 
score for the series of four TAT cards was 36, and 
the maximum oral sadistic score was 20. A TAT 
sentence, “Her mother never gave her anything and 
they argued,” would be scored once for oral de- 
pendence (use of the word “mother”) and twice 
for oral sadism (deprivation and oral assault). It is 
important to note that for neither test were such 


food responses as “meat” and “two people eating” 
included in the Ss’ scores; to have scored such food 
responses would have contaminated the definition of 
orality with the everyday behavior of the obese Ss. 

The level-of-aspiration task was scored in several 
ways, using various measures of the D score (the 
difference between the performance level, as reported 
by the experimenter, and S’s estimate of how well 
he expected to do on the next trial). Altogether six 
measures of the D score were used. In addition, the 
total number of shifts from a previous score, the 
number of unusual shifts (reporting a higher esti- 
mate after failure or a lower estimate after success), 
and shift range were also employed. Finally, responses 
following two designs, No. 2 and No. 5, 
were analyzed separately. Design 2 was considered a 
failure experience, since all Ss were told they only 
did as well as 35% of the population, and Design 5 
was considered a successful experience, since the Ss 
were told they scored 20% higher than they had 
previously estimated. 

For each of these tests, some Ss either refused 
to cooperate, or performed so poorly their responses 
could not be scored. Every S attempted the Ror- 
schach test, but two Ss, both controls, gave fewer 
than 10 responses; they were not included in the 
Rorschach analysis. The same two Ss and one more 
control refused to take the TAT. Five Ss claimed 
inability to complete the level-of-aspiration task. 
The numbers of Ss who gave usable protocols are 
found in Table 2. 
RESULTS 
Reliability 

The first two authors independently scored 
the Rorschach and TAT for oral dependence 
and oral sadism, using protocols which had 
all identifying information removed. There 
were 741 responses on the 36 usable Ror- 
schach protocols; of this number there were 
51 disagreements on presence or absence of 
an oral percept, producing 93% agreement in 
this dimension. Of the 133 oral responses, 
there were 9 disagreements on categoriz- 
ing the response as oral-dependent or oral- 
sadistic, producing 93% agreement within the 
dimension. 
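
The two agreement percentages above follow directly from the reported counts. A minimal sketch of the calculation, assuming simple percentage agreement (agreements divided by total judgments); the function name is illustrative:

```python
def percent_agreement(n_items, n_disagreements):
    """Percentage of items on which two scorers agreed."""
    return 100 * (n_items - n_disagreements) / n_items


# Presence or absence of an oral percept: 741 responses, 51 disagreements
presence = percent_agreement(741, 51)

# Oral-dependent versus oral-sadistic: 133 oral responses, 9 disagreements
category = percent_agreement(133, 9)

print(round(presence), round(category))
```

Percentage agreement does not correct for agreement expected by chance, so it is best read as a descriptive index rather than a reliability coefficient.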

Similarly, the TAT was scored blindly. 
Since the unit was the story, rather than dis- 
crete responses, percentage of agreement 
could not be calculated. Instead, a Pearson r 
was computed between the two raters’ scores. 
For oral dependence, this correlation was .83, 
and for oral sadism the correlation was .66, 
both correlations significantly greater than 
zero. Disagreements for both the Rorschach 
and TAT were resolved by conference without 
knowledge of S’s group. 

A summary of experimental results is found 
in Table 2. All probability levels are based 
on two-tailed tests. 


Rorschach 


The mean number of Rorschach responses 
for the obese Ss was 21.8 and for the control 
Ss 17.9; this difference was not statistically 
significant. Two separate analyses were then 
performed, one using percentage of oral re- 
sponses of the total number of responses 
given, and the second using the absolute num- 
ber of oral responses given by each S. Using 
percentage of oral responses, it was found 
that the obese Ss gave more oral-dependent 
responses (χ² = 9.37, p = .01) than the controls, 
but did not give more oral-sadistic responses. 
Using the absolute number of responses, 
rather than percentages, the same 
result was found: The obese Ss gave more 
oral-dependent responses than the control Ss 
(χ² = 7.70, p = .01), but not more oral-sadistic 
responses. 

Inspection of the Ss’ responses showed that 
not all subcategories of oral dependence and 
oral sadism were equally used. In the oral 
dependence category, the obese Ss gave a 
total of 21 responses to the subcategories of 
nurturers, supplicants, and food organs, while 
the control Ss gave only two responses in 
these subcategories. The significant chi-square 
reported above is due to the nonrandom dis- 
tribution of responses in these three sub- 
categories. Inspection of the oral-sadistic re- 
sponses showed that the “devourer” subcate- 
gory, which we had defined to include such 
popular, perhaps nonaggressive, percepts as 
bear and wolf, was responsible for the lack 
of significance between the two groups. With 
the “devourer” subcategory eliminated, a non- 
random pattern of response between the two 
groups occurred: The obese gave 22 responses 
to the subcategories of overwhelming figures 
(witch, giant) and burdens (camel, elephant, 
oxen), while the controls gave only 3 such 
responses. Table 3 presents the specific break- 
down of responses within the most sensitive 
subcategories. 


TABLE 3 

RESPONSES OF OBESE AND CONTROL SUBJECTS TO 
SELECTED SUBCATEGORIES OF ORAL DEPENDENCE 
AND ORAL SADISM 

                                            Number          Number 
                                            of responses    of responses 
Subcategories                               of Obese Ss     of Control Ss 
Oral dependence 
  Rorschach test 
    Nurturers                               7               1 
    Supplicants                             6               1 
    Food organs                             8               0 
  TAT 
    Passive, dependent                      12              4 
    Good luck, optimism                     22              11 
    Helplessness, loneliness, depression    32              11 
Oral sadism 
  Rorschach test 
    Overwhelming figures                    14              1 
    Burdens                                 8               2 
  TAT 
    Depriving others or being deprived      34              14 
The mean number of words used by the 
obese Ss was 255.00, while the control Ss 
used 204.33 words. This difference was not 
significant. Using the absolute number of 
themes as the predictor, it was found 
that the obese Ss gave more oral-dependent 
responses than the controls (χ² = 6.08, 
p = .02), but did not give more oral-sadistic 
responses. It was not possible to use the 
relative number of oral responses as a pre- 
dictor, since our measure consisted of both 
words and themes, so that there was no way 
to obtain a meaningful denominator from 
which to determine the ratio. 

Inspection of the oral dependence responses 
showed that for the most part large differ- 
ences between the obese and control Ss ap- 
peared in each subcategory; however, these 
differences were especially pronounced in the 
subcategories of passive, dependent themes, 
belief in good luck and optimism themes, 
and themes emphasizing helplessness, loneli- 
ness, and depression. For the oral-sadisitic 
responses, the only subcategory showing sig- 
nificant differences concerned themes of de- 
priving others or being deprived; the obese 
Ss gave 34 such themes and the control Ss 
only 14 (χ² = 4.06, p = .05).


Generality of the Oral Measures 


Since the Rorschach and the TAT have 
different stimulus characteristics, it was of 
interest to learn the extent to which oral 
responses would be elicited from both tests. 
One measure of this was obtained by com- 
puting a Pearson r between scores for each 
S on the TAT and Rorschach. The correla- 
tion for oral dependence was .58 and for oral 
sadism it was .51; both correlations are 
significantly greater than zero beyond the .01 
level. 

A second indication of the generality of 
the oral responses was obtained by examining 
the number of Ss who scored consistently 
high or low in orality on both tests. Table 4 
presents these data. It can be seen that of 
the 21 Ss who scored consistently in either 
the bottom half or top half of both tests on 
oral dependence, all the control Ss were below 
the median and 9 of the 11 obese Ss were 

TABLE 4

DISTRIBUTION OF OBESE AND CONTROL SUBJECTS WHO
SCORED CONSISTENTLY HIGH OR LOW ON
BOTH RORSCHACH AND TAT

               Above median on    Below median on
               both Rorschach     both Rorschach
Subject        and TAT            and TAT

Obese          9                  2
Control        0                  10


above the median. A Fisher exact probability 
test (Siegel, 1956) computed on these data 
was significant beyond the .001 level. Thus, 
with this particular sample, knowing an S 
scored either high or low on oral dependence 
on both tests gave 90% correct prediction 
(19 hits and 2 misses) for obese versus non- 
obese status.² However, accuracy declines
appreciably if one starts with the 20 obese and
15 control Ss who took both tests and at- 
tempts to predict consistency of oral-depend- 
ence scores. In this case, only 9 of the 20 
obese Ss are predicted correctly and 10 of 
the 15 controls, for a hit ratio of 54%, with 
most of the misses due to Ss who scored high 
on one test and low on the other. 
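
The figures in this paragraph can be rechecked from the Table 4 counts alone. The sketch below is a minimal illustration, not the authors' procedure: it computes a Fisher exact probability directly from the hypergeometric distribution and recomputes the two hit ratios. The function name and the two-tailed rule (summing all tables no more probable than the observed one) are assumptions introduced here.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Return (one_tailed_p, two_tailed_p) for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # One-tailed: observed table plus all tables more extreme in the same direction.
    one_tailed = sum(prob(x) for x in range(a, hi + 1))
    # Two-tailed (assumed rule): all tables no more probable than the observed one.
    two_tailed = sum(prob(x) for x in range(lo, hi + 1)
                     if prob(x) <= p_obs * (1 + 1e-9))
    return one_tailed, two_tailed

# Table 4: rows = obese, control; columns = above median on both, below on both.
one_p, two_p = fisher_exact_2x2(9, 2, 0, 10)

# Hit ratio among the 21 Ss who scored consistently on both tests.
hit_ratio_consistent = (9 + 10) / 21

# Hit ratio when starting from all 20 obese and 15 control Ss.
hit_ratio_all = (9 + 10) / (20 + 15)

print(f"one-tailed p = {one_p:.6f}, two-tailed p = {two_p:.6f}")
print(f"hit ratios: {hit_ratio_consistent:.0%} and {hit_ratio_all:.0%}")
```

Under these assumptions both probabilities fall below .001 and the hit ratios round to 90% and 54%, in agreement with the values reported in the text.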

A similar analysis was carried out with the 
oral-sadistic category, but no significant 
results were obtained. 


Level of Aspiration 


None of the six measures of the D score 
proved significant, nor did the total number 
of shifts, the number of unusual shifts, or 
shift range discriminate between the two 
groups. The groups did differ in their re- 
sponses to the question, “What score do you 
expect to get next time?” after failure 
(Design 2) and success (Design 5). After 
failure, the control Ss raised their scores more 
than the obese (t = 2.43, p = .05)³; after
success the obese raised their expected
achievement more than the controls (χ² = 6.64,
p = .01). The question, “What score do you


2 As impressive as this statistic is, there is even 
a better method for predicting obesity: The psy- 
chologist can look at the subject. 

3 The t test was used here rather than χ² because
all but three of the obese Ss gave the same response, 
removing the possibility of determining a median 
score. 
hope to get next time?” did not discriminate
between the groups on any measure. Most 
Ss stated they hoped to be at the 100th 
percentile for each trial and did not vary this 
response. 


Discussion 


Several conclusions can be drawn from this 
study. Both the Rorschach and TAT can be 
scored quickly and reliably for oral depend- 
ence and oral sadism. Clinical observations 
of the obese patient were confirmed: The 
obese gave many more oral-dependent re- 
sponses on both projective tests than the 
controls. However, oral dependence was not 
a homogeneous category, and such aspects as 
nurturers, supplicants, food organs (on the
Rorschach) and passivity, dependence, help- 
lessness, loneliness, depression, and belief in 
good luck and optimism (on the TAT) 
showed especially strong relationships to 
obesity. For the most part, oral sadism was 
not related to obesity, as expected, but again, 
this category was not homogeneous. On the 
Rorschach the oral-sadistic subcategories of 
overwhelming figures and burdens, and on the 
TAT the subcategory of being deprived or 
depriving others, were related to obesity. If 
these subcategory relationships are found in 
other studies, either the traditional definitions 
of oral dependence and oral sadism need to 
be reformulated or the psychological theory 
of obesity needs to be modified. 

Behavior of obese Ss on the level of aspira- 
tion was not clarified by this study. It was 
hypothesized that the obese would show 
greater difficulty in making realistic estimates 
of their achievement than the controls. The 
two groups did differ in their responses to 
success and failure, but if anything, the obese 
showed better ability to adjust to failure than 
the controls. This question clearly needs 
further investigation. 

Accuracy in prediction was aided materially 
by using only those Ss who performed con- 
sistently high or low in oral dependence. 
While the correlations of both oral measures
between the two tests were significantly
greater than zero, they only account for about
30% of the variance. Orality is evidently
not a characteristic of behavior which will be 
manifested in every situation nor elicited in
every projective test, but is partly a function 
of the total stimulus configuration. 
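
The "about 30% of the variance" figure follows from squaring the two cross-test correlations reported in the Results (.58 for oral dependence, .51 for oral sadism). The short sketch below works through that arithmetic; the helper name is introduced here for illustration, and averaging the two squared values is an assumption about how the rounded figure was reached.

```python
# Cross-test (TAT vs. Rorschach) correlations reported in the Results section.
r_oral_dependence = 0.58
r_oral_sadism = 0.51

def variance_explained(r):
    """Proportion of variance shared by two measures correlated at r."""
    return r ** 2

shared = {
    "oral dependence": variance_explained(r_oral_dependence),
    "oral sadism": variance_explained(r_oral_sadism),
}
# Assumption: the "about 30%" in the text summarizes both measures together.
mean_shared = sum(shared.values()) / len(shared)

for name, value in shared.items():
    print(f"{name}: {value:.0%} of variance shared")
print(f"mean: {mean_shared:.0%}")
```

The squared correlations come to roughly 34% and 26%, so some 70% of the variance in each oral measure is not shared between the two tests.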

This study took place in a culture in which 
oral needs are emphasized. Not only is the 
Jewish family said to be traditionally matri- 
archal, but many Israelis have known ex- 
treme, intensive food deprivation, either 
through concentration-camp experience or 
through the extensive food shortages during 
the 1948 War of Independence. However, as 
noted earlier, finding a correlation between 
physical characteristics and personality char- 
acteristics does not demonstrate that the lat- 
ter determines the former; a case could be 
made for the assertion that outsize body 
weight, which could originate for a variety of 
nonpersonality reasons, determines depend- 
ent, passive behavior. And there is always the 
possibility that an uncontrolled factor, oper- 
ating independently, influenced the test re- 
sponses. There is ample evidence that situa- 
tional forces get reflected in test responses 
(Masling, 1960). In the present experiment, 
the obese Ss were all on severe diets; the 
control Ss were not. The increase in oral- 
dependent responses may have resulted, at 
least in part, from the food deprivation 
experienced by the obese Ss. 

It would be of great interest to determine 
whether the close relationship found in Is- 
raelis between obesity and orality will also 
be found in other cultures. It would also be
of interest to compare both obese Ss who are
dieting and obese Ss who are not dieting with
dieting and nondieting controls.
EFFECT OF NURTURANCE ON WIVES’ APPRAISALS OF
THEIR MARITAL SATISFACTION AND THE DEGREE
OF THEIR HUSBANDS’ APHASIA ¹


JOAN BUXBAUM ²
Institute of Rehabilitation Medicine, New York University Medical Center 


In the present study, the main hypothesis tested was that there is a relation- 
ship between wives’ degree of nurturance (need to give affection and care) 
and their perception of the severity of their husbands’ speech disabilities. It 
was further hypothesized that there would be a positive relationship between 
wives’ nurturant needs and their reports of marital satisfaction. In addition, 
it was hypothesized that wives who were high in nurturance would report 
fulfilling affectional roles and would report sharing activities with their hus- 
bands more often than wives who were low in nurturance. Speech ratings 
were made on the Functional Communication Profile. Marital satisfaction was 
measured by a marital roles and attitudes questionnaire, a need-satisfaction 
questionnaire and a marital happiness scale. The major hypotheses were confirmed
with the exception of the fulfillment of affectional roles. Here no significant
differences were found between high-nurturant and low-nurturant groups.


This study deals with the relationship be- 
tween a wife’s needs and the effect that they 
have on her judgment of her husband’s dis- 
ability. Nurturance was the chief need inves- 
tigated. The subjects used in the study were 
wives of men who had aphasia (acquired im- 
pairment of verbal behavior affecting any or 
all language processes) as the result of a 
stroke (cerebral vascular accident). Since the 
ability to communicate verbally is consid- 
ered central in interpersonal relationships, it 
appeared to be of both practical and theo- 
retical importance to study the effects in a 
marriage of the loss of verbal communica- 
tion in one of the partners. Such loss could 
hopefully shed light on the dynamics of 
marital relations when they are placed under 
severe strain. This type of disability not only 
requires an adjustment to a traumatic oc- 
currence and its resulting consequences in 
everyday life patterns, but it also presents 
problems in adjustment to a change in the 
physical condition and personality of the 


1 This study was supported in part under the des- 
ignation by the Department of Health, Education 
and Welfare, Vocational Rehabilitation Adminis- 
tration to the Institute of Rehabilitation Medicine, 
New York University Medical Center.

2 This article is based upon a doctoral dissertation
submitted to Teachers College, Columbia University.
Grateful acknowledgment is extended to Donald
E. Super, Laurance F. Shaffer, and Roger A.
Myers.


husband as a result of brain damage. Aphasia 
afforded the opportunity to investigate a loss 
of functioning in an actual life situation 
which could not readily be reproduced in a 
laboratory. A meaningful situation in which 
to study the influence of needs on personal 
judgment was, therefore, available. Another 
important factor was that the degree of speech 
functioning is clearly measurable. It is neither 
so ambiguous as to be completely open to 
subjective impressions, nor so clearly deline- 
ated that personal attitudes could not influ- 
ence judgment of the degree of impairment. 

Most of the studies in the literature in- 
volving a family’s reaction to a disability 
have dealt with parents and their disabled 
children (Coughlin, 1941; Farber, 1960; 
Wortis, 1954). Other studies concerned them- 
selves with investigations of the effects of a 
disabled person in relation to family patterns 
and attitudes (Deutsch & Goldston, 1960, 
1962). A search of the literature has revealed 
almost nothing relating to a marriage part- 
ner’s reaction to the sudden onset of a spouse’s 
disability. It has been overlooked in reha- 
bilitation because most often the concentra- 
tion has been directed to the patient and his 
adjustment to his life and to his attitudes 
towards his family and other persons. Clinical 
observations of wives’ adjustments to their 
husbands’ conditions by the author and other 
staff members in a rehabilitation center
showed wide differences, regardless of the 
severity of disability. Wives’ reactions seemed 
to have an important effect on their hus- 
bands’ rehabilitation. For this reason, the 
present study was undertaken with nurtur- 
ance as the need variable. 

Murray’s psychology of needs (1938) pro- 
vided a theoretical framework because of its 
analysis of needs in relation to environmental 
forces and their effects upon perception. 
Murray speaks of needs as, “inner energies 
of which the personality may be wholly un- 
aware,” but which “seem to influence per- 
ception, apperception and intellection [p. 8].” 

The observations of Katz, Cohen, and Cas- 
tiglione (1963) on need satisfaction as an im- 
portant component of marital adjustment and 
of positive perceptions of a spouse add sup- 
port to the use of Murray’s theory of needs. 
In another study Katz, Glucksberg, and 
Krauss (1960) found need satisfaction in 
marriage, for both husbands and wives, to be 
positively related to wives’ scores on nur- 
turance and succorance. A third study by 
Katz, Goldston, Cohen, and Strucker (1963) 
indicated that wives whose needs were met 
by their husbands described their husbands 
more favorably than did wives whose needs 
were not being met by their husbands. 

Research in the field of marital adjustment 
has concentrated on interpersonal percep- 
tion, complementarity of needs, and role ful- 
fillment. The conclusion that Tharp (1963) 
makes in a review of studies of marriage pat- 
terning is that, “marital satisfaction is a 
function of the satisfaction of needs and/or 
expectations specific to husband and wife 
roles [p. 115].” 

In the present study, the general hypothe- 
sis tested was that wives who are high in 
nurturance would rate their husbands’ speech 
disabilities as less severe than wives who are 
low in nurturance. It was further hypothe- 
sized that wives who are high in nurturance 
would show higher scores on measures of 
marital satisfaction than wives who are low 
in nurturance. In addition, it was hypothe- 
sized that wives who are high in nurturance 
would report fulfilling affectional roles and 
sharing activities with their husbands more 


often than would wives who are low in nur- 
turance. 


METHOD
Subjects 


The subjects (Ss) were 47 middle-class, white fe- 
males. They were wives of men who had sustained 
a stroke at least 6 months prior to the study. All 
the Ss’ husbands were treated and diagnosed at the 
Institute of Rehabilitation Medicine as right hemi- 
plegic and aphasic. 


Measurement of Speech Impairment 


The Ss’ judgments of the severity of their hus- 
bands’ speech impairments were measured by the 
Functional Communication Profile (FCP), a rating 
scale regularly employed by the Speech Department 
at the Institute of Rehabilitation Medicine (Taylor, 
1963). This scale consists of an estimate of a pa- 
tient’s ability to perform 50 common language func- 
tions of everyday life. Such items as saying greetings 
and speaking whole sentences are included. In a
factor analysis (Taylor, 1965) of the FCP, five 
factors emerged: oral movement, speaking, under- 
standing (auditory comprehension), reading, and 
“other,” which includes such items as time orienta- 
tion and writing one’s own name. The FCP has 
been used effectively at the Institute of Rehabilita- 
tion Medicine in rating more than a thousand adult 
aphasics. Interrater reliability of speech therapists at
the Institute of Rehabilitation Medicine was .95. 

The FCP was designed to encompass as much in- 
formation as possible concerning a patient’s speech 
in the most simple and apparent manner. The form
makes no reference to symptomatology, nor to di- 
agnostic labels. It was, therefore, considered to be a 
nontechnical, readily comprehensible rating scale, 
suitable for a lay person’s use. The FCP provides 
ratings for each type of speech behavior on a 9-point
continuum ranging from normal to poor. Directions
were given to each S with illustrative explanations
for each item and descriptions of functioning
from normal to poor. For each husband,
scores for the five factors revealed in the factor 
analysis were obtained separately from the wife and 
from the speech therapist, who was familiar with 
the S's husband. A total score for each S’s husband 
was obtained by summing the scores on the five fac- 
tors. The difference between the speech therapist's 
total score and the wife’s total score (total differ- 
ence score) for the same individual constituted the 
data used from the FCP. 
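
The scoring described above reduces to summing five factor scores per rater and subtracting one total from the other. The sketch below is only an illustration of that arithmetic: the factor scores are hypothetical values, the function names are introduced here, and the sign convention (therapist total minus wife total) is an assumption, since the text does not state the direction of subtraction.

```python
# The five FCP factors named in the text.
FACTORS = ("oral movement", "speaking", "understanding", "reading", "other")

def total_score(factor_scores):
    """Sum the five FCP factor scores into one total."""
    return sum(factor_scores[f] for f in FACTORS)

def total_difference_score(therapist_scores, wife_scores):
    # Assumed sign convention: therapist total minus wife total.
    return total_score(therapist_scores) - total_score(wife_scores)

# Hypothetical factor scores for one husband (illustrative values only).
therapist = {"oral movement": 6, "speaking": 4, "understanding": 7,
             "reading": 5, "other": 6}
wife = {"oral movement": 7, "speaking": 6, "understanding": 8,
        "reading": 5, "other": 7}

diff = total_difference_score(therapist, wife)
print(total_score(therapist), total_score(wife), diff)
```

One such difference score per couple is the quantity correlated with the wife's nurturance score.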


Measurement of Personal Needs 


Nurturance, in addition to eight other needs, was 
assessed by a personal preference schedule, adapted 
by Katz, Cohen, and Castiglione (1963) from the 
Edwards Personal Preference Schedule (Edwards, 
1959). This schedule was selected because it spe- 
cifically dealt with nurturance needs in relation to 
interactions with the spouse. For example, in the 
Katz version a statement of nurturance was, “I like
to show a great deal of affection towards my
spouse.” Katz, Cohen, and Castiglione (1963) found
an internal consistency of .64 for wives’ nurturant
scores.


Measurement of Need Satisfaction (NS) 


NS was measured on a scale also devised by 
Katz, Cohen, and Castiglione (1963) from the de- 
scription of needs used on the Edwards Personal 
Preference Schedule. This measure was based on Ss’ 
ratings of the degree to which their husbands tended 
to satisfy or to thwart various needs of the S. This 
scale has been found by Katz, Goldston, Cohen, and 
Strucker (1963) to distinguish high and low satis- 
faction groups significantly. The S’s ratings on all 
needs were summed to yield a total satisfaction 
score.


Measurements of Marital Roles, Attitudes, 
and Activities 


In addition to formal rating procedures, a ques- 
tionnaire was devised to obtain the following infor- 
mation: social and biographical data on husband 
and wife, marital roles within the family structure, 
degree of liking of role and role changes, extent of 
family activities before and after disability, and rat- 
ings on degree of marital happiness before and after 
disability.

Social and biographical data. Two separate sheets 
were used to obtain demographic material from the 
wife, one for herself and the other for her husband. 

Degree of liking or disliking of role and role 
changes. The same 13 items used to assess role and 
role changes were also used to assess liking or dis- 
liking of particular items. They were answered twice, 
once to rate liking of an item in a retrospective re- 
port before disability and again to rate the item 
after disability. The Kuder-Richardson test of re- 
liability was computed for the before and after 
scales. The before scale showed a reliability of .67
and the after scale .71. The Ss were required to rate 
their feelings on a 13-point scale for each item.
Scores were determined individually and then to- 
taled for each item for before and after disability. 

Family activities. The Ss answered this section of
the questionnaire by reporting their recollection of
activities in which they participated before the onset
of their husbands’ disabilities. They answered this
section again to report changes in their activities
since the onset of the disabilities. The Ss were required
to rate how much time was spent in activities
together with their spouses or separately from
them. These items included social and family gatherings,
recreation, and organizational activities. Scores
were obtained by categorizing items by “together”
or “separate” activities. This was done separately
for the high-nurturant group and the low-nurturant
group.

Marital happiness. The marital happiness scale was 
adapted from Locke and Wallace’s (1959) study. 
Locke and Wallace obtained a reliability of .90 on 
split-half testing for all the items of their marital
adjustment scale. The Ss were required to check on 
a 13-point scale the degree of their marital happi- 
ness, as recalled before their husbands’ disabilities 
as well as after. The Ss’ scores were determined by 
measuring the point of the check mark on the scale. 


RESULTS 


No significant differences were found between
high-nurturant and low-nurturant
groups on demographic variables. The hy- 
pothesis that wives’ nurturant needs were re- 
lated to a positive bias in the perception of 
their husbands’ speech impairments was sup- 
ported. Table 1 shows the correlation of nur- 
turance with the total difference on speech 
scores from the FCP. The higher a wife’s 
nurturance score, the relatively less impaired 
did she judge her husband’s speech (p < .05, 
one-tailed test). 

The hypothesis that high nurturance was 
associated with measures of marital satisfac- 
tion was also confirmed, as shown in Table 1. 
These results confirm the hypothesis that 
nurturance is positively related to need satis- 
faction (p < .01, one-tailed test). The results 
also confirm the hypothesis that nurturance is 
significantly related to high estimates of mari- 
tal happiness, both before and after disability 
(p < .01 and p < .05, respectively, one-tailed
tests). 

Chi-square revealed no significant differ- 
ences between high-nurturant and low-nur- 
turant groups on fulfillment of affectional 
roles. Chi-square also confirmed the hypothe- 
sis that high-nurturant wives more often 
shared activities with their husbands than did 
low-nurturant wives, both before their hus- 
bands’ disabilities (p < .05) and after their 
husbands’ disabilities (p < .02). 


TABLE 1

CORRELATIONS OF MAJOR VARIABLES WITH
NURTURANCE FOR TOTAL GROUP

Variable                                Nurturance

Total speech difference score           r = .24*
Need satisfaction                       r = .39**
Marital happiness-before disability     r = .35**
Marital happiness-after disability      r = .33*

Note. N = 47.
 * p < .05, one-tailed test.
** p < .01, one-tailed test.



DISCUSSION 


According to findings of this study, nurtur- 
ance is an important variable in a wife’s 
perception of her husband and her reaction to 
him, not only in relation to experiencing a 
more satisfying marriage, but also in relation 
to having a more positive perception of him 
even after a disability has changed him. That 
is, the “satisfied” wife might feel greater 
motivation to “satisfy” her husband. To use 
Murray’s (1938) framework, the environ- 
mental press of the husband’s disability has 
not materially changed the nurturant wife’s 
perception of her husband. She still sees him 
as providing a positive cathexis for her and 
is still attracted to him as a person in need 
of care and nurturing. 

It may be possible that husbands respond 
more to nurturant wives. High-nurturant 
wives are, therefore, not necessarily distort- 
ing when they indicate that they feel their 
husbands are communicating more, because 
they may actually elicit more communication 
from them. It is also possible that high-nur- 
turant women are so empathic to their hus- 
bands that they can anticipate responses more 
readily, and so communication between them 
may be less impaired than speech therapists’ 
ratings of them on the FCP may imply. 

Several conclusions with regard to reha- 
bilitation seem to follow from this study. High 
nurturance seems to be an asset to wives 
whose husbands have sustained a disability, 
whereas wives who are low in nurturance
seem to have a more difficult time in coping 
with the changes due to their husbands’ im- 
pairments. If wives low in nurturance could 
be discovered early in their husbands’ treat- 
ment, perhaps they could be helped to un- 
derstand their reactions, so that they might 
have less need to devalue their husbands be- 
cause of their changed status. 

It may well be that the factor of nurtur- 
ance in wives is equally important when deal- 
ing with other disabilities or other life stresses 
that radically alter a husband’s role within 
the family. This study points to the impor- 
tance of taking into account the pattern of 
need satisfactions of family members in the 


attempt to obtain maximum functioning from 
the disabled person. 

The results of this study point to nurtur- 
ance as a fruitful area for further investiga- 
tions, both in terms of the theoretical ramifi- 
cations and the practical applications to the 
rehabilitation process. 
EFFECTS OF VERBAL MEDIATORS ON A NONVISUAL
FORMBOARD TASK* 


ROBERT M. KNIGHTS AND MARGARET R. OLVER


University of Western Ontario 


Studies of the performance of brain-damaged Ss on formboard tasks have 
indicated that the test is sensitive to brain lesions. The test is known to 
require tactual, kinesthetic, and spatial abilities, but the influence of verbal 
mediators has not been thoroughly investigated. The present study investigated
the influence of labeling the blocks used in a formboard task on speed
of performance. 20 freshmen performed on formboards which contained
familiar and unfamiliar shaped blocks in a counterbalanced design, with and
without the benefit of labels for the blocks. The learning of names for the
unfamiliar blocks significantly improved the speed of performance. The im- 
plications of the results for the use of verbal mediators on formboard tasks 


are discussed. 


Since Halstead’s (1947) use of a modified 
Seguin formboard for the diagnosis of brain
damage, a number of neuropsychologists have 
included a formboard task in their test bat- 
teries. Performance on formboards has been 
shown to be sensitive to cerebral lesions and 
particularly to those of the posterior portion 
of the brain (Reitan, 1959, 1964; Teuber, 
1962; Teuber & Weinstein, 1954; Weinstein, 
1962). The test is considered to assess tactile,
kinesthetic, and spatial abilities, and Halstead’s
(1947) factor analysis found his formboard
test loaded on a factor exemplified by
a loss of ability to recognize objects (ag- 
nosias) and a loss of ability to execute acts 
(apraxias). 

Some speech functions are impaired by le- 
sions in the parietal and posterior portions of 
the brain, particularly in the dominant hemi- 
sphere (Reitan, 1960; Teuber, 1962). Per- 
formance on the formboard is sensitive to 
parietal area dysfunction, and it is reasonable 
to assume that test performance may be af- 
fected by the level of verbal ability as well 
as tactual and spatial skills. 

There are two principal methods to deter- 
mine the influence of language abilities or 
verbal mediators on formboard performance.
One is to compare the performance of brain-damaged
individuals who have specific language
deficits with those who reveal no verbal
deficits. This type of study was done by
Reitan (1960), who administered the form- 
board to groups of dysphasic, nondysphasic, 
and normal control adults. He found that the 
control group did much better on the form- 
board than the two brain-damaged groups, 
but that the dysphasic group was significantly 
poorer than the nondysphasic group on speed 
of formboard performance. This result sug- 
gests that verbal abilities are important for 
this psychomotor task. 

A second method for assessing the influ- 
ence of verbal mediators on formboard per- 
formance is to manipulate the verbal labeling 
associated with the blocks by comparing per- 
formance on the tasks when familiar and un- 
familiar shaped blocks are used. The present 
study investigates the effects of the associa- 
tion of verbal labels (names) to seen forms on 
a subsequent nonvisual tactual-motor per- 
formance task. Visual experience is permitted 
in the training period, but not during the 
testing period, in which the subject (S) com- 
pletes a formboard task while blindfolded. 


METHOD 
Subjects 


The Ss were 8 males and 12 females drawn from 
classes in introductory psychology at the University 
of Western Ontario. All Ss were naive with respect 
to the general purpose of the study. Five Ss were
discarded because of failure to return for the second 
testing. 


Apparatus and Procedure 


Two modified Seguin-Goddard six-hole formboards
were used, one containing familiar shapes (star,
cross, etc.) and the other containing unfamiliar
shapes. The unfamiliar shapes consisted of asym- 
metrical variations equated in size to the familiar 
shapes and matched for the number of sides. Each 
block was assigned a name suggested by the shape 
(angel, butterfly, etc.). The test was administered as 
suggested by Halstead (1947). The formboard was 
sloped at 70° and the blindfolded S given three trials 
in which he fitted the blocks in the holes, first with 
the dominant hand, then with the nondominant 
hand, and then with both hands. 

The procedure consisted of three parts: (a) pre- 
labeling trials, (b) a label-learning period, and (c) 
postlabeling trials. In the prelabeling trials the 20 Ss
performed on both the familiar and unfamiliar 
formboards without any specific instructions re- 
lated to labeling the blocks. The order of presenta- 
tion of the two formboards was counterbalanced 
across Ss. The S’s dominant hand was guided 
quickly around the perimeter of the formboard and 
over the blocks on the table in order to familiarize 
him with the location of the board and blocks. He 
was instructed to pick up the blocks one at a time 
and place each block in its appropriate hole as 
quickly as possible, using only the dominant hand. 
The S then performed the task with his nondominant 
hand and, finally, with both hands. 

The label-learning period occurred following the 
prelabeling performance with both formboards. The 
Ss were split into “Label” and “No-Label” condi- 
tions with 10 Ss in each group. In the Label Condi- 
tion Ss were shown each block and informed of its 
name. The six blocks in both the familiar and unfamiliar
sets were left exposed and were reviewed by the
S until their names were known. This took one trial
for the familiar shapes, since the Ss knew these, and 
a maximum of three trials for the unfamiliar shapes. 
In the No-Label Condition the blocks were placed 
in front of the S, and he was informed that he 
could observe the blocks he had been using. The 
observation period was 3 minutes and comparable to 
the maximum time required in the label-learning 
period. 

The postlabeling trials occurred 3 days later. All 
Ss were again shown the blocks and those in the 
Label Condition reviewed the names prior to per- 
forming with both the familiar and unfamiliar 
shapes. Those in the No-Label Condition again 
viewed both sets of blocks. Following this review 
period, the Ss performed the three trials—the domi- 
nant hand, the nondominant hand, and both hands— 
on both formboards. The order of familiar and unfamiliar
formboards was again counterbalanced. The
Ss in the postlabeling trials composed two groups of
10 Ss each: the Label Group which performed with
both types of formboards and the No-Label Group 
which performed with both types of formboards. 


RESULTS 


Figure 1 shows the speed of performance 
over the three trials during the pre- and post- 
labeling periods for the Label and No-Label 

[Figure 1 not reproduced. Vertical axis: speed in minutes; panels: prelabeling and postlabeling.]

FIG. 1. The mean time scores of the Label and
No-Label Groups for the three prelabeling and postlabeling
trials with the unfamiliar and familiar
shapes.


Groups performing with the familiar and unfamiliar
blocks. The Ss in the groups using
unfamiliar blocks took longer to place them 
in their appropriate holes than the groups 
using familiar blocks. Figure 1 also shows 
that during the postlabeling period the Ss who 
learned names for the blocks performed most 
rapidly. 

Three analyses of variance were performed 
on the data: one on the prelabeling time 
scores, one on the postlabeling time scores, 
and one on the change scores obtained by sub- 
tracting each of the three postlabeling trials 
from each of the prelabeling trials. 

The prelabeling scores were analyzed with 
a three-way analysis of variance. Although 
the Label versus No-Label Conditions did not 
occur until after the training period, this 
variable was included in order to assure that 
the performance of the two groups was simi- 
lar prior to the label-training period. The 
between-Ss variable was Names (label versus 
no-label) and the within-Ss variables were 
Shape (unfamiliar versus familiar) and Tri- 
als (dominant, nondominant, and both 
hands). The Names groups did not differ significantly.
A significant difference was found
between the groups using the unfamiliar and
familiar shaped blocks (F = 59.85; df = 1/18;
p < .001). The expected improvement
over the three trials (F = 11.25; df = 2/36;
p < .01) was significant.

The analysis of variance of the postlabeling 
scores was a similar 2 X 2 X 3 design with
Names as the between-Ss variable and Shape
and Trial as the within-Ss variables. No significant
main effect of the presence or absence
of labels was found, but again the Shape
of block (F = 23.54; df = 1/18; p < .001)
and the Trials effect (F = 24.05; df = 2/36;
p < .001) were significant. No other
effects were significant.

The most important analysis, for testing 
the differential effect of the experimental con- 
ditions, was the analysis of variance of the 
change scores obtained by comparing the pre- 
labeling and postlabeling scores. Again this 
analysis was a 2 (Names) X 2 (Shapes) X 
3 (Trials) design. The main effect of Names 
was not significant, indicating no difference 
as a function of labels or no-labels when the 
familiar and unfamiliar block groups were 
combined. The Shape of block was significant
(F = 19.95; df = 1/18; p < .01), indicating
more rapid performance with the
familiar shapes. The significant interaction of
Names X Shapes (F = 5.38; df = 1/18; p < .05)
indicates that the labeling of the
blocks was associated with a greater change
in speed of performance for the groups using
unfamiliar blocks. The mean change score
from the pre- to postlabeling trials for the
Label and No-Label Groups with the unfamiliar
blocks was also compared with a t
test using an estimate of error variance based
on the between-groups mean squares obtained
from the overall analysis of variance. The
significant difference (t = 2.08; p < .05) indicates
that the performance of the Label
Group improved significantly over that of the
No-Label Group.

Comparison by t tests of the performance
of males and females revealed that the males
were consistently but nonsignificantly faster
than the females. In the prelabeling period
with the familiar blocks, the total mean time 
was 1.19 minutes for the males and 1.32 
minutes for the females. With the unfamiliar 
blocks the total mean time for the three pre- 
labeling trials was 3.05 minutes for the males 
and 3.75 minutes for the females. 


DISCUSSION


The results indicate that performance on
the formboard with unfamiliar shapes improves
significantly when labels are given to
the blocks. This suggests that verbal mediation
is a factor in formboard performance.
The results are consistent with those of Rei- 
tan (1960), who found that brain-damaged 
adults with dysphasia performed less ade- 
quately than patients without dysphasic symp- 
toms. Apparently, the formboard test re- 
quires many types of abilities, verbal as well 
as tactual and spatial, and this may be the 
reason that the test is sensitive to lesions of 
various types and in various areas of the 
brain. 

The finding that verbal mediators are one 
of the factors involved in formboard perform- 
ance was also demonstrated by Atkinson 
(1966). This study was designed to examine 
the significance of spatial variables in form- 
board performance, but did include groups 
with relevant and irrelevant verbal pretrain- 
ing. He found that college freshmen who had 
verbal pretraining that was relevant to the 
subsequent formboard performance performed 
significantly more rapidly than the group 
with irrelevant pretraining. The research lit- 
erature of a nonclinical nature on the effects 
of verbal mediators on motor responses indi- 
cates the importance of the type of verbal 
pretraining used and the specific nature of the 
type of response required (Vanderplas, 1963). 

In a study similar to the present experi- 
ment, Knights, Hyman, and Atkinson (1966) 
manipulated the verbal labels associated with 
blocks and found no effects on speed of form- 
board performance. Although the Ss were 
children, the difference in findings is consid- 
ered to be due to differences in procedure. In 
the Knights et al. (1966) study the Ss were 
not permitted to see the shapes, while in the 
present study they did observe the blocks. 
This difference in results may indicate the 
importance of the relationship between visual 
and verbal processes. The effectiveness of 
verbal mediators in facilitating psychomotor 
performance may be a function of the presence
or absence of the use of the visual modality.

Another method of examining the signifi- 
cance of verbal mediators in formboard per- 
formance is by comparing groups with differ- 
ent intellectual levels. It is assumed that re- 
tardates have less adequate verbal skills than 
normals and, hence, would perform less rapidly
on psychomotor tasks. Although this
hypothesis has been supported (Matthews & 
Reitan, 1963), it is difficult to distinguish 
whether the poorer performance is due to less 
adequate verbal mediation or to poorer tac- 
tual and psychomotor skills (Denny, 1964). 

From the studies reviewed it is concluded 
that verbal mediators are one of the many 
variables involved in formboard performance. 
The fact that other abilities (problem-solving,
tactual, kinesthetic, and spatial) are also
involved is likely to contribute to the general 
sensitivity of formboard performance to brain 
lesions. 
INFERENCE OF ATTITUDES FROM NONVERBAL
COMMUNICATION IN TWO CHANNELS *


ALBERT MEHRABIAN AND SUSAN R. FERRIS
University of California, Los Angeles 


In the present study 3 degrees of attitude (i.e., positive, neutral, and negative) 
in facial expression were each combined with 3 degrees of attitude communi- 
cated vocally. The vocal communications of attitude were superimposed on 
a neutral word. In preparing the 2-component communications, the components 
were selected so that the degree of positive attitude communicated facially 
was equivalent to that communicated vocally—that is, the independent effects 
of the 2 components were comparable. It was found that attitudes inferred 
from combined facial-vocal communications are a linear function of the 
attitudes communicated in each component, with the facial component 
receiving approximately 3/2 the weight received by the vocal component. 
Implications of the findings for more general attitude-communication problems
are discussed.


While there are many studies of nonverbal 
attitude or feeling communication in single 
channels (e.g., reviews by Davitz, 1964 or 
Mahl & Schulze, 1964), investigators are only 
beginning to explore simultaneously trans- 
mitted feelings or attitudes in two or more 
channels. Gates’ (1927) investigation of 
single-channel decoding of facial and vocal 
stimuli is relevant to the present study. She 
found that children are more accurate in 
their judgments of facial compared to vocal 
expressions of feeling. Unfortunately, her 
method only allows a tentative conclusion 
that discrimination of feeling on the basis of 
facial cues is easier than discrimination of 
feeling on the basis of vocal cues. There is, 
however, some corroboration of Gates’ find- 
ings in a study by Levitt (1964). Com- 
municators were filmed as they attempted to 
communicate six emotions facially and vo- 
cally, using neutral verbal materials. The de- 
coding of facial and vocal stimuli in combina- 
tion was only as accurate as the decoding of 
facial stimuli alone, and both conditions were 
more accurate than the decoding of vocal 
stimuli alone. This finding can be interpreted 
to indicate that in a two-channel facial-vocal 
communication of emotion, the facial channel
contributes more to the decoding of the total 
message than the vocal channel. 

There is one other study in which the 
characteristics of two-channel communications 
of emotion have been explored. Williams and 
Sundene (1965) used the semantic differential 
method (Osgood, Suci, & Tannenbaum, 1957) 
to obtain judgments of the same emotions 
communicated facially, vocally, and in facial- 
vocal combinations. All three modes of com- 
munication of emotion were found to be 
recognized in terms of the three factors of 
general evaluation, social control, and activity. 

It should be noted that none of the fore- 
going studies investigated two-channel com- 
munications in which the emotion communi- 
cated in the facial expression was inconsistent 
with that communicated vocally. Despite the 
paucity of experimental studies of decoding 
of multichannel communications of feeling 
or attitude by any particular population 
(e.g., children or adults), there is some theo- 
retical consideration of the effects of such 
communications. Bateson, Jackson, Haley, 
and Weakland (1956) proposed a “double 
bind” theory of schizophrenia. They consider 
the maladaptive responses of schizophrenics 
to be a consequence of their being the fre- 
quent recipients of inconsistent attitude com- 
munications. The double-bind communication 
can be defined as typically consisting of two 
or more inconsistent attitude messages which 
are assumed to elicit incompatible responses 
from the addressee. For example, a mother
asks her son to come over and kiss her while 
she nonverbally communicates disinterest in 
what he is requested to do. The child is 
assumed to be left with the difficult task of 
responding to either the verbal or the non- 
verbal component, with the knowledge that 
response to the former will elicit a rebuff and 
response to the latter will elicit indignation. 
The recipients of frequent double-bind mes- 
sages are assumed to learn to respond with 
their own double-bind messages. In the ex- 
ample considered, the child may respond with, 
“I can’t come because my leg hurts,” or “I 
can’t come because Trap is holding me,” the 
hurt leg and Trap being figments of his 
imagination. 

While double-bind communications are as- 
sumed to lead to the development of mal- 
adaptive patterns of interpersonal function- 
ing, Haley (1963) also conceptualized most 
psychotherapeutic processes as being inter- 
pretable within a beneficial double-bind para- 
digm. Haley’s thesis is that applications of 
the beneficial double bind serve the function 
of successfully eliminating the secondary gain 
which is associated with a symptomatic be- 
havior and therefore eliminating the behavior. 

The above assumptions about the change- 
inducing properties of inconsistent communi- 
cations require clarification through investiga- 
tion of the ways in which multichannel at- 
titude communications are decoded. Since the 
quantification of degree of consistency or in- 
consistency between communications in two 
channels is only possible if the two com- 
munications can be scaled along a common 
dimension, the general evaluation dimension 
obtained in the Williams and Sundene (1965) 
study seems appropriate. Mehrabian and 
Wiener (1967) pursued these considerations 
by investigating the decoding of two-channel 
vocal-verbal communications in which three 
degrees of attitude (i.e., positive, neutral, and 
negative) in the vocal component were each 
combined with three degrees of attitude in the 
verbal component (i.e., meanings of words). 
They found that when vocal communication 
of attitude is inconsistent with verbal com- 
munication of attitude, normal addressees re- 
spond to the two-channel communication by 
subordinating the verbal component to the
vocal component. If, for example, the word
“scram” is said in a tone of voice which 
is independently judged as communicating 
positive attitude towards the addressee, the 
consensual interpretation of the total com- 
munication is positive. 

The present study was designed to investi- 
gate the decoding of inconsistent and con- 
sistent communications of attitude in facial 
and vocal channels. Three degrees of attitude 
(i.e., positive, neutral, and negative) com- 
municated in facial expressions were each 
combined with three degrees of attitude com- 
municated vocally. In accordance with Gates’ 
and Levitt’s findings, it was expected that the 
decoding of a consistent facial-vocal com- 
munication would yield a judgment equiva- 
lent to that obtained from the decoding of 
the facial component only—that is, the facial 
component would be dominant. Furthermore, 
since Mehrabian and Wiener’s (1967) study 
indicated that the dominant component in a 
two-channel communication determines the 
meaning of inconsistent communications, it 
was expected that the decoding of an incon- 
sistent facial-vocal communication would 
yield a judgment equivalent to that obtained 
from the decoding of the facial component 
only. It was therefore hypothesized that judg- 
ments of attitude, on the basis of consistent 
and inconsistent pairings of facial and vocal 
attitude communications, would yield a main 
effect due to variations in the facial com- 
ponent and no effect due to variations in the 
vocal component or its interaction with the 
facial component. 


METHOD 


Subjects. A group of 25 subjects (Ss) was used 
in the preliminary selection of a neutral word. A 
second group of 17 Ss was used to assess the inde- 
pendent effects of facial and vocal communications. 
A third group of 20 Ss was used to obtain the 
combined effects of facial-vocal communications. 
All 62 Ss were female University of California 
undergraduates who participated in the study in 
partial fulfillment of introductory psychology course 
requirements. 

Materials. For the selection of a neutral attitude- 
communicating word, 25 Ss were asked to judge the 
attitude of a speaker towards her addressee when 
saying each of 15 words. The 15 words were each
presented in written form and Ss recorded their
judgments of attitude on a 9-point scale designated
“like very much,” +4 and “dislike very much,” −4
at its poles. The word “maybe” was selected as an
appropriate neutral verbal carrier of vocal communi- 
cations, since it was rated .28 on the attitude scale 
with a standard deviation of .72. 

For the selection of vocal communications of three 
degrees of attitude, three female speakers were 
instructed to vary their tone of voice while saying 
the word “maybe,” so as to communicate like, neu- 
trality, and dislike towards an imagined addressee. 
Each speaker was instructed to say the word 
“maybe” twice in the same way while her com- 
munications were being recorded on magnetic tape. 
The 18 items, consisting of two instances each of 
positive, neutral, and negative vocal attitude com- 
munications obtained from the three speakers, were 
presented to 17 female Ss with the following written 
instructions: 


The purpose of this study is to find out how well 
people can judge the feelings of others. In this 
part, you will hear a recording on which the word 
“maybe” is spoken in different tones of voice. You
are to imagine that the speaker is saying this word 
to another person, the addressee. For each tone, 
indicate on the scale what you think the speaker's 
attitude is towards the addressee. 


A modified form of the semantic differential instruc- 
tions (Osgood, Suci, & Tannenbaum, 1957, pp. 80- 
85) was used to direct Ss’ use of an attitude scale 
designated “like,” +3 and “dislike,” −3 at its poles.
The 18 items were presented in a different random
order to each S. The positioning of positive and
negative poles of the scale was counter-balanced and 
was random in the 18-item sequence. 

The facial communications of three degrees of 
attitude were selected in a similar manner. Photo- 
graphs of three female models were taken as they 
attempted to use facial expressions to communicate 
like, neutrality, and dislike towards another person. 
The photographs were 34 X 4% inch black and white 


TABLE 1

INDEPENDENT EFFECTS OF VOCAL AND FACIAL COMMUNICATIONS:
DEGREE OF POSITIVE ATTITUDE INFERRED FROM THREE KINDS OF
VOCAL ATTITUDE COMMUNICATION BY TWO SPEAKERS AND THREE
KINDS OF FACIAL ATTITUDE COMMUNICATION BY TWO MODELS

Inferred attitude scores corresponding to communications considered

                  Positive        Neutral         Negative
Communicator      M      SD       M      SD       M      SD
Speaker 1        2.41   0.79     0.06   0.82    −2.29   0.77
Speaker 2        2.35   0.70    −1.12   1.11    −2.18   0.73
Model 1          2.12   0.70     0.35   0.61    −2.65   1.00
Model 2          2.35   0.61    −0.24   1.15    −2.53   0.62
prints of head only against a neutral background.
Eighteen items, consisting of two photographs for
each degree of attitude communicated by each of 
the three models, were presented to the same 17 Ss 
who judged the vocal communications. The instruc- 
tions for recording judgments of the facial com- 
munications of attitude were analogous to those 
used with the vocal communications. The Ss were 
randomly assigned so that eight of them judged 
facial communications prior to judging vocal com- 
munications and nine made their judgments in 
reverse order. 

On the basis of Ss’ judgments of the vocal and
facial communications, three vocal communications
(i.e., positive, neutral, and negative), obtained from
each of two speakers, and three facial communications,
obtained from each of two models, were
selected. As the data in Table 1 indicate, the facial
attitude communications of a given value (e.g.,
positive) were selected to match the vocal attitude
communications of the same value. Standard deviations
of judgments as well as their means were
approximately matched. A 3 Attitude X 4 Communicator
X 17 Subject analysis of variance of the
data summarized in Table 1 indicated a significant
effect due to the Attitude factor (F = 333.47,
df = 2/32; MSe = 1.12, p < .001), no significant effect
due to the Communicator factor (F < 1, df = 3/48;
MSe = .37, p > .25), and no significant Attitude
X Communicator effect (F = 1.05, df = 6/96; MSe =
.70, p > .25). Thus, the independent effects of all
vocal communications of attitude are comparable
to the independent effects of all facial communications
of attitude within each of the three levels of
attitude investigated.

Design. The three vocal attitude communications 
of each speaker were each paired with the three 
facial attitude communications of each model. Therefore,
there were 36 experimental conditions, consisting
of 3 Vocal Attitude X 3 Facial Attitude X
2 Speaker X 2 Model interactions. All 36 conditions
were administered to each of 20 Ss, thus yielding a
3 X 3 X 2 X 2 factorial design with repeated measures
on all factors.
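
The crossing described above can be sketched in a few lines. This is only an illustration of how 3 X 3 X 2 X 2 yields 36 conditions and how each S might receive them in a different random order; the labels (`speaker_1`, `model_1`, and so on), the function names, and the use of the S number as a random seed are assumptions made for the example, not details reported in the study.

```python
import itertools
import random

# Factor levels; the label strings are hypothetical names for illustration.
ATTITUDES = ("positive", "neutral", "negative")
SPEAKERS = ("speaker_1", "speaker_2")
MODELS = ("model_1", "model_2")


def build_conditions():
    """Cross Vocal Attitude x Facial Attitude x Speaker x Model."""
    return list(itertools.product(ATTITUDES, ATTITUDES, SPEAKERS, MODELS))


def presentation_order(subject_id, conditions):
    """Return the conditions shuffled for one S (seeded so it is repeatable)."""
    rng = random.Random(subject_id)  # seeding by S number is an assumption
    order = list(conditions)
    rng.shuffle(order)
    return order


conditions = build_conditions()
orders = {s: presentation_order(s, conditions) for s in range(1, 21)}
print(len(conditions), "conditions for each of", len(orders), "Ss")
```

Because every S receives all 36 conditions, each factor is a repeated measure, which is what the design statement above asserts.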

Procedure. The experiment was individually ad- 
ministered to each S. The written instructions pre- 
sented to the Ss were: 


The purpose of this study is to find out how 
well people can judge the feelings of others. You
will be shown photographs of different facial
expressions and at the same time you will hear a
recording of the word “maybe” spoken in different 
tones of voice. You are to imagine that the person 
you see and hear (A) is looking at and talking 
to another person (B). For each presentation, 
indicate on the scale what you think A’s attitude 
is towards B. 


A second form of instructions was identical to 
those presented above with the exception that refer- 
ences to facial expressions and seeing, and tones 
of voice and hearing, were made in reverse order. 
The Ss were randomly assigned so that half received 
the first form and half received the second form
of instructions. A modified form of the semantic 
differential instructions was again used to direct Ss’ 
use of an attitude scale designated “like,” +3 and 
“dislike,” —3 at its poles. 

The 36 two-channel communications were pre- 
sented in a different random order to each of 20 
Ss. In each experimental condition, the vocal and 
facial components of the communication were pre- 
sented simultaneously, so that Ss heard the vocal 
component only while seeing the facial component 
and vice versa. 


RESULTS 


Each S recorded 36 responses, correspond- 
ing to all possible combinations of three 
facial communications of each of two models 
with three vocal communications of each of 
two speakers. The responses, which had a 
possible range of +3 to —3, were analyzed in 
a 3 X 3 X 2 X 2 factorial design with repeated
measures on all factors. The analysis
indicated a significant effect due to Facial Attitude
(F = 233.14, df = 2/38; MSe = 2.37,
p < .001) and a significant effect due to Vocal
Attitude (F = 77.49, df = 2/38; MSe = 3.33,
p < .001). None of the other main or interaction
effects attained the .05 level of significance.
The Facial Attitude X Vocal Attitude
interactions with MSe = 2.21 are summarized
in Table 2. The Newman-Keuls method 
(Winer, 1962) yielded significant differences 
at the .01 level for all comparisons within 
each level of both factors. 

The Facial Attitude factor accounted for 
41.4% of the total sum of squares, whereas 
the Vocal Attitude factor accounted for 19.3% 
of the total sum of squares. Furthermore, 
the effects of the facial and vocal components 
were strongly linear. The linear trend ac- 
counted for 97% of the effect due to the facial 
component and 99% of the effect due to the 
vocal component. Moreover, the combined ef- 
fect of the facial and vocal components was 
a weighted sum of their independent effects, 
since there was no significant interaction be- 
tween them. The following regression equa- 
tion summarizes the relative contributions of 
facial and vocal components to interpretations
of combined facial-vocal attitude
communications:

    A_T = 1.50 A_F + 1.03 A_V


TABLE 2 


EFFECTS OF TWO-CHANNEL FACIAL-VOCAL COMMUNICATIONS:
DEGREE OF INFERRED POSITIVE ATTITUDE CORRESPONDING TO
THE FACIAL ATTITUDE X VOCAL ATTITUDE INTERACTIONS

                        Facial component
Vocal component     Positive    Neutral    Negative
Positive              2.45        1.31      −0.91
Neutral               1.33        0.50      −1.62
Negative              0.20       −1.07      −2.47


A_T represents attitude inferred on a −3 to
+3 scale from the two-channel communications.
A_F represents attitude communicated
in the facial component alone and is assigned
values of +1, 0, and −1 for positive, neutral,
and negative attitude, respectively. Similarly,
A_V represents attitude communicated in the
vocal component alone. The .95 confidence
interval for the coefficient of A_F is 1.32 to
1.68, while that of the coefficient of A_V is
.79 to 1.28. The absence of overlap between
the two intervals indicated that the effect due
to the facial component was significantly
greater than that due to the vocal component.
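
As a rough check on the regression weights, the two slopes can be recovered from the nine cell means of Table 2 alone. The sketch below assumes the +1, 0, −1 coding described above and reads the table with vocal attitude in rows and facial attitude in columns; the names `TABLE_2`, `CODES`, and `slope` are illustrative. Since the coded predictors are balanced and uncorrelated across the nine cells, each least-squares slope reduces to sum(x * y) / sum(x * x). This works from cell means only and is not a reconstruction of the authors' analysis of the raw responses.

```python
# Coding of attitude levels as given in the text.
CODES = {"positive": 1, "neutral": 0, "negative": -1}

# Cell means from Table 2, keyed as (vocal level, facial level).
TABLE_2 = {
    ("positive", "positive"): 2.45,
    ("positive", "neutral"): 1.31,
    ("positive", "negative"): -0.91,
    ("neutral", "positive"): 1.33,
    ("neutral", "neutral"): 0.50,
    ("neutral", "negative"): -1.62,
    ("negative", "positive"): 0.20,
    ("negative", "neutral"): -1.07,
    ("negative", "negative"): -2.47,
}


def slope(component):
    """Least-squares slope for one component in a balanced, orthogonal layout."""
    idx = 0 if component == "vocal" else 1
    num = sum(CODES[key[idx]] * mean for key, mean in TABLE_2.items())
    den = sum(CODES[key[idx]] ** 2 for key in TABLE_2)
    return num / den


b_facial = slope("facial")
b_vocal = slope("vocal")
print(f"facial weight = {b_facial:.2f}, vocal weight = {b_vocal:.2f}")
```

The slopes computed this way agree, to two decimals, with the coefficients 1.50 and 1.03 in the regression equation.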


Discussion 


The hypothesis of the present study was 
only partially supported. A main effect due 
to variations in the facial component and 
no effect due to variations in the vocal com- 
ponent or its interaction with the facial 
component had been expected. The results of 
the study indicate that the facial and vocal 
components do not interact and that the facial 
component has a stronger effect than the 
vocal component. However, in contrast to the 
hypothesis, the effect due to the vocal com- 
ponent is also significant. Thus, the results of 
the study can be summarized as follows: 
Attitudes inferred from two-channel facial- 
vocal attitude communications are a linear 
function of the attitude communicated in 
each component, with the facial component 
receiving approximately 3/2 the weight 
received by the vocal component. 

The above results were obtained from a 
sample of normal adult female Ss who were 
communicators and addressees. However, it is 
likely that the linear model for two-channel
communications of attitude obtained for 
female communicator-addressee combinations 
has broader applicability. For instance, the 
model (with, perhaps, slightly different rela- 
tive weights for the facial and vocal com- 
ponents) may be applicable to same- and 
different-sex communicator-addressee pairs of 
various ages. 

One interesting implication of the linear 
model with positive coefficients relates to 
redundant multichannel communications of 
attitude. The model indicates that the effect 
of redundancy (i.e., consistent attitude com- 
munication in two or more channels) is to 
intensify the attitude communicated in any 
one of the component channels. Thus, push- 
ing a child away while turning away from 
him is assumed to communicate a more 
negative attitude toward the child than only 
pushing him away or only turning away from 
him. Similarly, holding and kissing a child 
is assumed to communicate a more positive 
attitude towards the child than only holding 
or only kissing the child. 

A final comment is required to integrate 
the implications of the findings of the present 
study with the findings of the Mehrabian and 
Wiener study (1967). It is suggested that 
the combined effect of simultaneous verbal, 
vocal, and facial attitude communications is 
a weighted sum of their independent effects— 
with the coefficients of .07, .38, and .55,
respectively. Analytic procedures outlined by 
Anderson (1962, 1964) can presently be em- 
ployed to test this proposed weighted-sum 
model for any single decoder. In view of 
these extrapolations of experimental findings 
from the decoding of multichannel incon- 
sistent communications, the assumptions 
underlying the effects of double-bind com-
munications can be questioned. Further 
experimentation with schizophrenics or chil- 
dren as addressees is needed to clarify 
the pathology-inducing or behavior-modifying 
effects of inconsistent communications of 
attitude. 
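
The proposed three-channel weighted sum can be illustrated with a small sketch. The coefficients .07, .38, and .55 are those suggested above; the assumption that all three components are scored on one common scale (here −1 to +1), and the names `WEIGHTS` and `combined_attitude`, are introduced only for this example.

```python
# Coefficients proposed in the text for verbal, vocal, and facial components.
WEIGHTS = {"verbal": 0.07, "vocal": 0.38, "facial": 0.55}


def combined_attitude(verbal, vocal, facial):
    """Weighted sum of three component attitudes on a common scale (assumed)."""
    return (
        WEIGHTS["verbal"] * verbal
        + WEIGHTS["vocal"] * vocal
        + WEIGHTS["facial"] * facial
    )


# A negative word delivered with a positive voice and a positive face.
inconsistent = combined_attitude(verbal=-1, vocal=1, facial=1)

# Ratio of facial to vocal weight, for comparison with 1.50 / 1.03.
ratio_proposed = WEIGHTS["facial"] / WEIGHTS["vocal"]
ratio_observed = 1.50 / 1.03
print(inconsistent, round(ratio_proposed, 2), round(ratio_observed, 2))
```

Under these assumptions the inconsistent message is still decoded as clearly positive, and the facial-to-vocal ratio of the proposed coefficients stays close to the ratio of the regression weights reported in the Results.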
INSTRUMENTS
Two groups of experimenter-subjects administered Xerox copies of 5 Holtzman
inkblots with the expectation of obtaining either a high or a low total number 
of responses. One group formulated their own expectations, the other was given 
expectations. The “given” group was subdivided on the basis of their scores
on a test measuring the tendency to acquiesce to psychologists as prestigeful 
figures. Bias was found to be an especially strong phenomenon, independent 
of the source of the hypothesis (own or given), subjects’ prestige acquiescence, 
number of questions asked in inquiry, or experimenter-subjects’ knowledge 
of the purpose of the experiment. Some evidence was obtained suggesting that 
learning, perhaps in the form of verbal conditioning, might be a bias-mediating
process.


“Experimenter bias” refers to the effects 
that an experimenter’s expectations have on 
the data he obtains. Rosenthal and his co- 
workers (Rosenthal, 1963, 1964, 1966; Ro- 
senthal & Fode, 1963; Rosenthal & Lawson, 
1963) have found these effects with both ani- 
mal and human subjects. He hypothesizes 
that subtle and often unconscious manner- 
isms (gestures, tonal inflections, etc.) on the 
part of the experimenter may cue the sub- 
ject as to the appropriate course of action. 

Due to the highly complex interpersonal 
transaction characterizing the clinical situa- 
tion, one would expect these bias factors to 
play an even larger role here than in the 
more circumscribed laboratory setting. Evi- 
dence that clinicians influence their clients’ 
responses comes from many sources. Heller 
and Goldstein (1961) have demonstrated that 
clients behave in accordance with their therapists’
expectations. Masling (1960) has reviewed
studies identifying different sources of
extratest influence on a client’s responses. He 
has also found that Rorschach examiners ex- 
pecting either human or animal responses 
tend to elicit these responses from their sub- 
jects (Masling, 1965). Other experiments 
have shown that different examiners tend to 
elicit significantly different kinds and num- 
bers of Rorschach determinants from their 


1 The authors wish to thank Robert Rosenthal for 
his helpful comments concerning portions of the 
final manuscript. Parts of the research were reported 
in a paper read at the 1966 Midwestern Psychologi- 
cal Association convention. 


clients, even when a standard client popula- 
tion and a standard inquiry are employed 
(Baughman, 1951; Gibby, 1952; Gibby, Mil- 
ler, & Walker, 1953; Guilford & Lacey, 1947). 

Masling’s (1965) study used only examiners
who had been given an hypothesis. The
present experiment compares this procedure 
with one in which examiners formulate their 
own hypotheses. Both methods have real-life 
bases in clinical and laboratory situations. 
In the clinic, hypotheses may emanate from 
either the prevailing orientation of the setting 
and the clinician’s superiors, or from the cli- 
nician’s own preconceptions. In a typical 
graduate school research setting, an investi- 
gator may either receive his hypotheses from 
a faculty member or formulate them on his 
own. Although Marcia (1961), investigating 
a model of this latter situation, failed to 
confirm his hypothesis that experimenters 
who formulated their own hypotheses would 
yield more bias than those to whom hypothe- 
ses were given, he expressed some reservations 
about the effectiveness of his “own” hypothe- 
sis condition. The present study deals with 
the testing-situation model and makes the 
prediction that more bias will be forth- 
coming from an “own-hypothesis” than from 
a “given-hypothesis” condition. In addition, 
within the given-hypothesis condition, it is 
predicted that those testers with a tendency 
to acquiesce to psychologists as prestigeful
figures will bias more than those without 
this tendency. 
The four aims of this study were: (a) to
replicate the previous findings of “tester 
bias”; (b) to test the hypothesis that testers 
who make their own hypotheses yield more 
bias than those to whom hypotheses are 
given; (c) to determine whether testers in 
the given condition who are high in prestige 
acquiescence yield more bias than those who 
are low in this characteristic; and (d) to 
investigate some parameters of bias mediation. 


METHOD
Subjects 


Experimenter-subjects (E-Ss) were 36 volunteer 
students from an undergraduate course in experi- 
mental psychology at the State University of New 
York at Buffalo. Their subjects (Ss) were 53 
students from an introductory psychology class 
at the same university. 


Apparatus 


Basic apparatus consisted of achromatic Xerox 
copies of Holtzman inkblot cards: 25A, 34A, 19A, 
12A, and 27A. The Ss’ responses and reaction times 
for each card were recorded on separate sheets of 
a 5-page booklet. Each E-S was also given a reminder
sheet summarizing the procedure and providing
a set of standardized instructions for the Ss
he was to test. 


Experimental Groups 


E-Ss were divided into two groups: Own-Hypothesis
and Given-Hypothesis. The Own-Hypothesis
Group was split into high and low sections on the 
basis of a questionnaire soliciting their opinions as 
to whether the well-adjusted college students they 
were about to test would give a high or a low 
total number of responses to an inkblot test. To 
maximize each E-S’s investment in his hypothesis, 
he was required to give two reasons for his expecta- 
tion, a possible counter-argument, and his answer 
to this argument. Thirteen of the Own-Hypothesis 
Group expected a high number of responses; six
expected a low number of responses. 

The Given-Hypothesis Group was split in ap- 
proximately equal proportion to the Own-Hypothesis 
Group. Twelve of these E-Ss were told to expect 
a high number of responses from their Ss and five 
were told to expect a low number of responses. 

It will be noted that the Given-Hypothesis Group 
was not required to provide arguments and counter- 
arguments for the expectations they had received. 
To have done so would have violated a major 
purpose of the study: to construct and contrast 
naturalistic models of two common situations. It 
seemed a reasonable assumption that in everyday 
life those to whom hypotheses are given expend 
less cognitive effort than those who formulate their 
own hypotheses. This was the situation to be 
replicated. 
Prestige Acquiescence 


Each E-S took a “prestige acquiescence” scale for 
psychologists. This newly constructed scale required 
each E-S to indicate the degree to which he was 
willing to accept each of 35 disputable statements, 
assuming that the statement had been made by a 
member of each of the following five professions: 
architect, county judge, psychologist, minister, and 
member of the board of directors of a large corpora- 
tion. According to a recent nationwide occupational 
prestige survey by Hodge, Siegel, and Rossi (1964), 
the first two professions are popularly rated directly 
above psychologist, and the latter two are rated 
directly below psychologist. 

Two items typical of the scale are: 


1. Be critical of all you do. It’s a good outlook 
on life. 
____ Member of board of directors of a large corporation 
____ County judge 
____ Psychologist 
____ Architect 
____ Minister 
2. A liberal arts background should be mandatory 
before any specialized education. 
____ County judge 
____ Minister 
____ Member of board of directors of large corporation 
____ Psychologist 
____ Architect 


For each debatable statement, the S indicated which 
professional source he would be most willing to 
believe by placing a 1 in front of it; the source 
he would acquiesce to least, he indicated with a 5. 
All others were assigned intermediate ranks. The 
score used for the analysis of data was the sum 
of the ranks assigned to “psychologist.” The lower 
the sum of ranks, the higher the assumed prestige 
acquiescence to psychologists. 
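
The scoring rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the source labels, the function name, and the two respondents are hypothetical and do not come from the original scale materials.

```python
from typing import Dict, List

# Hypothetical labels for the five rated sources; the wording is an assumption.
SOURCES = ("architect", "county judge", "psychologist", "minister", "board member")


def prestige_acquiescence_score(rankings: List[Dict[str, int]],
                                target: str = "psychologist") -> int:
    """Sum the ranks (1 = most willing to believe, 5 = least) given to `target`."""
    total = 0
    for item_number, ranks in enumerate(rankings, start=1):
        # Each statement must rank all five sources exactly once, using 1..5.
        if set(ranks) != set(SOURCES) or sorted(ranks.values()) != [1, 2, 3, 4, 5]:
            raise ValueError(f"item {item_number}: expected ranks 1-5 over all sources")
        total += ranks[target]
    return total


# Two made-up respondents on a 35-statement scale.
always_first = [dict(zip(SOURCES, (2, 3, 1, 4, 5)))] * 35  # psychologist always ranked 1
always_last = [dict(zip(SOURCES, (1, 2, 5, 3, 4)))] * 35   # psychologist always ranked 5

lowest_possible = prestige_acquiescence_score(always_first)   # 35
highest_possible = prestige_acquiescence_score(always_last)   # 175
```

With 35 statements the sum can range from 35 to 175, and a lower sum corresponds to higher assumed prestige acquiescence to psychologists.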


Procedure 


Two weeks prior to the experiment, E-Ss were 
administered the prestige-acquiescence scale by their 
regular laboratory instructor under the guise of a 
normal laboratory exercise. 

The experiment was run in a single day. The E-Ss 
in the Own Condition met in the morning as one 
group to discuss the Holtzman with the investigator. 
They were told that it was a newly devised tech- 
nique which would serve as a parallel form to the 
Rorschach. Their task was to aid in the collection 
of additional normative and validational data in 
those few areas where Holtzman results had so far 
not duplicated Rorschach results. One such area 
appeared to be the total number of responses ob- 
tained when testing a well-adjusted college popula- 
tion. Because of their rather unusual dual role as 
both experimenters and college students, they were 
told that it would be of some value to have not 
only the data they were about to collect, but also 
some of their ideas about this problem area. They 
then filled out the questionnaires concerning their 
own expectations and provided their rationale. 
Upon completion of this, E-Ss were given the stand- 
ard instructions for administration of the blots and 
emphatically reminded to give a verbatim written 
account of both the Ss’ responses and of any ques- 
tions which they, the E-Ss, might ask. The E-Ss 
were assigned to rooms where they found the num- 
bered inkblot cards, an instruction sheet, and record- 
ing booklets. They then ran the Ss who were ran- 
domly assigned. When they finished testing, the E-Ss 
met again with the investigator and filled out a 
final questionnaire asking them to title the experi- 
ment and to describe what they felt it was “really 
about.” 

The high and low subgroups within the Given 
Condition met separately in the afternoon. The 
ensuing procedure was essentially the same as with 
the Own-Hypothesis Group, the basic exception 
being that these E-Ss were told that Holtzman 
results appeared to duplicate the Rorschach and 
that their task was simply to aid the investigator 
in obtaining more data. One group was informed 
that Rorschach norms indicated a high total number 
of responses with well-adjusted college students; 
the other group was led to expect a low total 
number of responses. The only other difference from 
the Own-Hypothesis Group was that these E-Ss, 
upon completion of testing, were asked the addi- 
tional question of whether or not they had had any 
preexperimental expectations regarding the total 
number of responses given by well-adjusted college 
students. 


RESULTS 


Establishment of Bias and Comparison of 
Own versus Given Conditions 


Both to determine the presence of bias and 
to test the hypothesis that bias is greater for 
the Own rather than for the Given Condition, 
the High Expectancy and Low Expectancy 
Groups were compared with each other under 
Own and Given Conditions. Data on which 
these comparisons were based are presented 
in Table 1, and the analysis of variance is 
presented in Table 2. 


TABLE 1 

NUMBER OF RESPONSES OBTAINED BY HIGH AND LOW 
EXPECTANCY GROUPS UNDER OWN AND 
GIVEN CONDITIONS 

                       Own                    Given 
Expectancy      N      M       SD       N      M       SD 
High           13    18.81    6.88     12    18.71    7.44 
Low             6    11.83    2.79      5    11.60    1.53 


TABLE 2 

ANALYSIS OF VARIANCE OF RESPONSES OBTAINED BY 
HIGH AND LOW EXPECTANCY GROUPS 

Source                        df       MS        F 
Expectancy (A)                 1     377.81    11.86** 
Experimental groups (B)        1        .00      .00 
A X B                          1        .03      .00 
Within                        32      31.84 
Total                         35 

**p < .01. 

The significant F ratio between high and 
low expectancy confirms the presence of bias. 
The E-Ss who expected to receive a large 
total number of responses from their Ss did 
elicit significantly more responses than those 
who expected to receive a low total number 
of responses. 

Both the interaction term of the analysis 
of variance and individual t tests between 
High and Low Expectancy Groups for Own 
and Given Conditions failed to indicate 
greater bias for the Own Condition. In fact, 
if anything, the Given Condition yielded 
slightly more bias. 


Prestige Acquiescence (PA) 


To test the hypothesis that E-Ss in the 
Given Condition who are high in PA yield 
more bias than those low in PA, the High 
Expectancy and Low Expectancy Groups 
within these conditions were compared. Anal- 
ysis of variance on these data indicates that 
bias was unaffected by prestige acquiescence 
(F = .03, df = 1/10, ns). 


Parameters of Bias 


Verbal inquiry. To check the possibility 
that the bias effects may have been due to 
E-Ss’ rate of questioning, the total number 
of questions asked by the High Expectancy 
and Low Expectancy Groups under the Own 
and Given Conditions were compared. The 
only significant finding here was an unex- 
pected interaction. E-Ss in the Given Condi- 
tion who expected a low number of re- 
sponses asked the most questions (F = 6.68, 
df = 1/32, p < .05). However, a correlation 
between the overall number of questions 
asked and the total number of responses 
obtained showed these to be unrelated 
(r = .07, N = 36, ns). 

Learning effects. It has been suggested by 
Marcia (1961), Masling (1965), and others 
that some forms of learning, such as verbal 
conditioning, may mediate bias. However, 
Rosenthal, Fode, Vikan-Kline, and Persinger 
(1964) have not been able to demonstrate 
this. Since the data were obtained in sequen- 
tial fashion, they provide information about 
the accumulation of bias effects. A learning 
process would appear as an increasing func- 
tion on a curve. The mean total number of 
responses to each card obtained by the 
different experimental groups, as well as the 
groups combined, is shown in Figure 1. 

The High Expectancy Group, under both 
Own and Given Conditions, shows a learning- 
like curve. The results of a trend analysis 
(Winer, 1962, pp. 70-77) performed on these 
data are presented in Table 3. A similar 
analysis for Low Expectancy E-Ss revealed 
no significantly consistent trend. These findings 
suggest that the High Expectancy E-Ss 
elicit an increasing number of responses from 
their Ss in a way consistent with learning. 


FIG. 1. Mean number of responses per card obtained 
by all experimental groups. (Key: Both, Own, Given; 
horizontal axis: cards 1 to 5.) 


TABLE 3 

ANALYSIS OF VARIANCE WITH TESTS FOR TREND ACROSS 
FIVE TRIALS FOR HIGH EXPECTANCY E-Ss 

Source of variance       df        MS         F 
Trials                    4      127.81      112 
Error                   136      562.57 
Total                   140      690.38 
Components 
  Linear (a)          1/136                19.03** 
  Quadratic (b)       1/136                 4.93* 
  Cubic (c)           1/136                 3.10 

a Variance accounted for: 62%. 
b Variance accounted for: 16%. 
c Variance accounted for: 10%. 
*p < .05. 
**p < .01. 

Bias across Ss. Some E-Ss tested two Ss. 
In order to determine whether or not an 
E-S’s tendency to bias increased with experi- 
ence, the mean total number of responses 
obtained from the first S was compared with 
that from the second S. These comparisons 
were not significant, indicating the constancy 
of bias across at least two Ss. 

Awareness of experimental manipulations. 
Two of the 17 E-Ss in the Given Condition 
and 9 of the 19 in the Own Condition were 
aware that it was a “bias” experiment 
(χ² = 3.81, df = 1, p < .06). The tendency 
for Own E-Ss to be more aware of the nature 
of the experiment was probably due to sus- 
picion engendered by the more complex 
operations they were required to perform. 
Dividing the Own E-Ss into High and Low 
Expectancy Groups, the mean total number 
of responses obtained by those suspecting 
a bias study was compared with the mean 
total number of responses obtained by those 
without this realization. The nonsignificant 
ts obtained indicated that awareness of the 
nature of the study had no effect on the 
amount of bias obtained. 
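
The awareness comparison above can be checked from the reported counts (2 of 17 Given E-Ss and 9 of 19 Own E-Ss aware). The following is a minimal sketch, assuming a 2 x 2 chi-square with Yates' continuity correction; the report does not say which formula was used, so the correction is an assumption, and the function names are illustrative.

```python
import math


def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with Yates' correction."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2 (never below zero) before squaring.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_p_value_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: Given, Own. Columns: aware, unaware (counts taken from the text).
chi2 = yates_chi_square(2, 15, 9, 10)
p_value = chi_square_p_value_df1(chi2)
print(round(chi2, 2))  # 3.81
```

Under this assumption the statistic rounds to the reported 3.81, and the corresponding probability falls between .05 and .06, in line with the reported p < .06.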


Discussion 


The present demonstration of tester bias in 
a projective test setting replicated Masling’s 
(1965) findings. In addition, the present 
study indicated that, under usually existing 
conditions, bias occurs to the same extent 
whether the origin of the influencing hypothe- 
sis is internal or external, a finding similar 
to Marcia’s (1961). 

It is true that one might create rather 
artificial situations in which experimenters 
would make their own hypotheses and expend 
little effort on them, or experimenters would 
be given hypotheses and required to 
“agonize” over them, and find differences 
in amount of bias as a function of these 
“effort” manipulations. However, that would 
be an atypical circumstance and the focus of 
this study was on the naturally occurring 
situation. 

The present finding of an equal tendency 
to bias across both Own and Given Condi- 
tions raises a valuable methodological point 
for future bias studies. The lengthy and dif- 
ficult task of requiring experimenters to 
formulate their own hypotheses, thereby in- 
suring their personal investment in them, 
seems unnecessary in order to produce bias. 
Experimental induction itself seems sufficient. 

The purpose of the prestige-acquiescence 
measure was to tap just one characteristic: 
Ss’ tendencies to agree with psychologists as 
opposed to other equally prestigious profes- 
sionals. In this case, the test itself—accept- 
ance of controversial statements if made by a 
psychologist—seemed the best behavioral cri- 
terion, hence, the authors’ assertion of some 
face validity for the prestige-acquiescence 
measure. 

The failure to find a relationship between 
prestige acquiescence and amount of bias 
may have been due to one of two factors: 
The bias phenomenon may be exceptionally 
vigorous, unaffected by prestige acquiescence; 
or the measure of prestige acquiescence, in 
spite of its face validity, may be poor. An 
interesting check on the first hypothesis 
would be a study manipulating the perceived 
prestige of the hypothesis source. 

The lack of correlation between the number 
of responses obtained and the number of 
questions asked was an unexpected finding. 
It was thought that E-Ss expecting either a 
high or low total number of responses would 
ask questions accordingly. Since this was not 
the case, the mediation of the bias effect 
posed a problem. 

The learning explanation suggested by 
Marcia, although empirically unsupported by 
Rosenthal to date, received some tentative 
confirmation in the form of the obtained 
learning-like curves. For all experimental 
conditions, the total number of responses ob- 
tained generally increased as the E-S pro- 
gressed from his presentation of the first card 
to the last card. However, the increase was 
significant only for the High Expectancy 
Groups. The form of the curves might be 
dictated by two factors: learning comple- 
menting practice effects for the High Expect- 
ancy Group and learning cancelling out prac- 
tice effects for the Low Expectancy Group. 
An experiment utilizing more trials would pro- 
vide further information on this somewhat 
speculative interpretation. 

Another explanation for the mediation of 
bias has been suggested by Rosenthal (1966). 
Rather than the Ss being conditioned trial by 
trial, the E-Ss may be conditioned subject by 
subject, each S constituting one trial for the 
E-S. Although the present study, with only 
two Ss per E-S at the most, was not de- 
signed to address itself to this problem, the 
failure to find more bias on the second than 
on the first trial casts doubt on the generality 
of some of Rosenthal’s earlier findings. 

The finding that E-Ss’ awareness of the 
intent of the experiment had no effect on 
the amount of bias yielded suggests that in 
future studies some leakage of information, 
whether by the investigator or by some perceptive 
E-S, may not be grounds to discredit 
the experimental results. In fact, manipulation 
of the E-Ss’ awareness might prove an 
interesting independent variable. 

With the establishment of both experi- 
menter and tester bias as phenomena cutting 
across numerous situations, and, to some 
extent, superseding specific motivational con- 
ditions, the next step in this area of situa- 
tional demand research is the investigation 
of parameters of bias mediation. The learn- 
ing—verbal-conditioning model holds some 
promise as an explanatory concept, at least 
for some bias situations. Especially valuable 
would be an exploration of an S’s reactions to 
the experimenter and to the experimental 
situation, perhaps utilizing the interview 
techniques developed for the more traditional 
verbal conditioning studies (Spielberger, 
1962). 
IDENTIFICATION AND FEAR DECREASE¹ 


ALAN S. DEWOLFE 


Veterans Administration Hospital, Downey, Illinois 


Measures of identification and fear of tuberculosis were obtained from 54 
student nurses on tuberculosis assignment. An r of .50 (p < .001) was found 
between identification with staff models (who were assumed to show little 
fear) and fear decrease, with preexposure fear and identification with nursing 
held constant. This supports the hypothesis that fear decrease in a fear-pro- 
voking situation is related to identification with models showing little fear in 
the situation. The results were interpreted as indicating a reciprocally reinforcing 
“snowball” effect in the relationship between fear decrease and identification 
with staff models. 


The reduction or extinction of fear has 
been a source of considerable interest in a 
number of areas of psychology, including 
clinical (e.g., Lang & Lazovik, 1963; Wolpe, 
1958), learning theory (e.g., Miller, 1959; 
Mowrer, 1960), and developmental and per- 
sonality (e.g., Miller & Dollard, 1941; Sears, 
1951). The mechanisms of identification and 
its effects are also of considerable concern to 
psychologists interested in personality and 
its development. The learning of attitudes, 
values, and affective responses through the 
medium of identification with models, par- 
ticularly parents, has been widely discussed. 
The learning of fears as the result of identi- 
fication with a parent who shows fear in 
the situation and its opposite—the extinction 
of fear in children as a result of their 
identification with a parent who is unafraid 
in a situation arousing fear in the child—have 
been relatively widely accepted. 

The primary purpose of the present study 
was to empirically test the hypothesis that 
decrease in fear in a fear-provoking setting 
which includes models showing little fear is 
related to identification with these models. 
In the present study, identification was assessed 
through perceived similarity in a 
semantic differential measure, measured by 
the deviations of the S’s ratings of the identification 
model and self-ratings as used by 
Lazowick (1955), Gray (1959), and Anisfeld, 
Munoz, and Lambert (1963). 


1 The data for this study were collected while the 
author was with the Veterans Administration Hospital, 
Hines, Illinois. Preparation of the manuscript 
was supported, in part, by funds from the Hall-Mercer 
Hospital Foundation, Institute of the Pennsylvania 
Hospital Division. Advice from Kenneth 
I. Howard and Herman Diesenhaus on the semantic 
differential scales was greatly appreciated. Special 
thanks are due to K. Edward Renner for his many 
valuable suggestions throughout the preparation of 
this manuscript and to Winfred F. Hill for his 
critical reading of the paper and his useful comments. 

Specifically, the relationship between de- 
crease in fear of tuberculosis among student 
nurses assigned to a tuberculosis treatment 
unit and identification with selected key staff 
members working in the unit was evaluated. 
The reduction of fear in these late adolescents 
was considered analogous to the reduction 
of fear in children through identification with 
a parent who is unafraid in the situation 
which provokes fear for the child. If the 
analogy is justified, then decrease in fear of 
tuberculosis for the students should be related 
to identification with staff members. 

It was assumed that these experienced staff 
members show little fear. Since all the staff 
models rated by the Ss had worked with the 
tuberculosis patients for at least 5 years, this 
assumption seemed warranted. 


METHOD 
Subjects 


The Ss were 54 student nurses on tuberculosis 
training assignment at the Veterans Administration 
Hospital, Hines, Illinois. 


Measures 


Fear measure. Preexposure fear (P-fear) and fear 
decrease were measured by a Fear of Tuberculosis 
Questionnaire consisting of 28 incomplete sentence 
stems. Examples of stems used in the questionnaire 
are: “If a girl I was rooming with told me she had 
‘cured’ tuberculosis, I ____” and “Wearing a cap, 
mask, and gown to talk with or treat patients 
makes me think ____.” The test previously showed 
test-retest reliabilities of .87 in a fear-provoking 
situation and .72 in a neutral situation (DeWolfe 
& Governale, 1964). 

Identification scores. As indicated earlier, identi- 
fication was measured by scores of perceived simi- 
larity between the Ss and the models on a semantic 
differential measure.² Specifically, the difference between 
the S’s ratings of the model and her self-ratings 
in Osgood D scores (Osgood, Suci, & Tannenbaum, 
1957) on the second administration of the 
semantic differential scales was the measure of 
identification. Maternal and paternal identification 
scores were the D scores of Mother/Me and 
Father/Me. 

2 The bipolar adjectives for the Evaluative dimension 
were happy-sad, good-bad, predictable-unpredictable, 
social-unsociable, effective-ineffective, 
kind-cruel, wise-foolish, beautiful-ugly, pleasant-unpleasant, 
and valuable-worthless. The Potency 
dimension used hard-soft, dangerous-safe, masculine-feminine, 
and strong-weak. The Activity dimension 
consisted of tense-relaxed, fast-slow, and active-passive. 
The ratio of the numbers of scales in the 
Evaluative, Potency, and Activity dimensions (i.e., 
10, 4, 3) was selected to approximate the amount 
of variance accounted for by each dimension in the 
factor analyses presented in Osgood, Suci, and 
Tannenbaum (1957). 

3 Identical semantic differential scales were given 
twice, about 1 week apart, as less response bias was 
expected in the second testing, based on Howard’s 
(1962) finding that a second test better differentiated 
individuals. The two response tendencies most 
likely to systematically affect identification scores 
from semantic differential measures are the “all-or-none” 
tendency (i.e., disproportionately high use of 
the +3, 0, and -3 scale points) and the social 
desirability tendency. To evaluate these effects in 
the second semantic differential, the S’s all-or-none 
scores (i.e., total of +3, 0, and -3 scores on the 
entire semantic differential) and social desirability 
scores (i.e., total of the S’s ratings of the three staff 
models on the 10 scales of the evaluative dimension) 
were correlated with the identification scores. The 
all-or-none score did not correlate significantly with 
any identification score, and the social desirability 
score correlated significantly with only one (identification 
with mother). Both positive and negative 
nonsignificant correlations with identification scores 
were found for both response-tendency scores. Thus, 
it appeared most unlikely that either of these response 
tendencies had a strong systematic effect on 
the identification scores. 


In the identification with staff models score, 
the S’s self-ratings were compared with her ratings 
of the three staff models (i.e., the two instructors 
and the head nurse on the S’s floor). These individuals 
were chosen because they were most 
frequently present when the students were in 
contact with the tuberculosis patients. Since the 
identification with staff models compared ratings 
of specific individual models with self-ratings, as in 
the parental identifications used in previous studies, 
these scores were considered analogous to parental 
identification measures. The actual score was a 
composite, computed by finding the square root of 
the sum of the squared deviations of the ratings 
of the model from self-ratings across all three staff 
models. This score was viewed as a composite of 
the identifications of the S with all three staff nurse 
models. The identification with nursing score was 
a comparison of the ratings of “Nurse” with self-ratings 
and was interpreted as a role identification 
rather than an identification with a particular model. 
To make the identification dimensions more directly 
comparable to the other variables used in the 
study, all the identification scores have been reflected 
(i.e., the poles of the identification dimensions have 
been reversed). Thus, throughout the Results and 
Discussion section, including the table, high scores 
indicate stronger identification rather than greater 
semantic distances and weaker identification. 
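
The composite distance score described above (square root of the summed squared deviations of model ratings from self-ratings, pooled over the three staff models) can be sketched as follows. The function names and the tiny rating vectors are hypothetical. The report says only that scores were reflected, so reflecting by simple negation is an assumption made here for illustration.

```python
import math
from typing import Sequence


def composite_d_score(self_ratings: Sequence[float],
                      models: Sequence[Sequence[float]]) -> float:
    """Osgood-style D: root of summed squared (model - self) deviations over all models."""
    squared_sum = 0.0
    for model_ratings in models:
        if len(model_ratings) != len(self_ratings):
            raise ValueError("each model must be rated on the same scales as the self")
        squared_sum += sum((m - s) ** 2 for m, s in zip(model_ratings, self_ratings))
    return math.sqrt(squared_sum)


def reflected_identification(distance: float) -> float:
    """Reverse the poles so that higher values mean stronger identification (assumed: negation)."""
    return -distance


# Made-up ratings on two scales for a self-concept and three staff models.
self_ratings = [0, 0]
staff_models = [[3, 0], [0, 4], [0, 0]]

distance = composite_d_score(self_ratings, staff_models)  # sqrt(9 + 16 + 0) = 5.0
identification = reflected_identification(distance)       # -5.0
```

In the study itself each concept was rated on 17 scales, so each S's composite would pool 51 squared deviations rather than the six shown here.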


Procedure 


The Fear of Tuberculosis Questionnaire was ad- 
ministered to the Ss at the hospital, but prior to 
their assignment to ward units. The initial semantic 
differential scales and the second fear questionnaire 
were administered together after the Ss’ third week 
of training. The final administration of the semantic 
differential measure was approximately 1 week later. 
The Ss used only code numbers on all test protocols 
to assure them that they would remain anonymous. 


RESULTS AND DISCUSSION 


The Ss’ fear of tuberculosis decreased sig- 
nificantly during the first 3 weeks of their 
tuberculosis assignment (sign test z = 3.7, 
p < .001). These results were parallel to the 
findings of an earlier study (DeWolfe & 
Governale, 1964) in which students’ fear de- 
creased significantly in the first 3 weeks, and 
this decrease was significantly greater than 
the decrease in a control group matched on 
initial level of fear. 


Relationships between Fear and Identification 


The relationships between the fear and 
identification measures are shown in Table 1. 
Since preexposure fear (P-fear) and fear- 
decrease scores correlated .66 (p < .001), all 
other correlations involving fear decrease are 
partial correlations with level of P-fear held 
constant. In addition, identification with 
nursing confounds the measure of identifica- 
tion with staff models, since part of a stu- 
dent nurse’s identification with a staff nurse 
model is based on her identification with 
nursing, independent of her identification 
with the model as an individual. Therefore, 
all r’s involving identification with staff 
models were partial r’s with identification 
with nursing held constant. 


TABLE 1 

RELATIONSHIPS AMONG THE FEAR AND 
IDENTIFICATION MEASURES 

Measure                                     1        2          3        4        5 
1. Preexposure fear 
2. Fear decrease (a)                       .66** 
3. Identification with staff models (b)    .01      .50** (c) 
4. Maternal identification                 .03      .10        .33* 
5. Paternal identification                 .13     -.07        .10      .22 
6. Identification with nursing            -.11     -.28*       .48**    .46**    .37** 

a Since preexposure fear correlates very highly with fear 
decrease, the correlations of fear decrease with other variables 
are partial r’s, with preexposure fear held constant. 

b Since identification with nursing confounds the measure of 
identification with staff models, all correlations involving the 
identification with staff models measure are partial correlations, 
with identification with nursing held constant. 

c As a consequence of the statistical procedures indicated in 
Footnotes a and b above, this correlation is a second-order 
partial r, with both preexposure fear and identification with 
nursing held constant. 

*p < .05. 

**p < .01. 

The r of .50 (p < .001) when fear decrease 
was related to identification with staff models 
was a second-order partial correlation with 
the effects of preexposure fear and identifica- 
tion with nursing both held constant. This 
finding supported the hypothesis that fear 
decrease in a fear-provoking setting is related 
to identification with models in that setting. 
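
For readers unfamiliar with the statistic, a second-order partial correlation can be built from zero-order correlations by applying the first-order formula twice. The sketch below uses the standard recursion. The function names and the numerical values are purely illustrative and are not taken from the study's data.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def second_order_partial_r(r_xy: float, r_xz: float, r_yz: float,
                           r_xw: float, r_yw: float, r_zw: float) -> float:
    """Partial correlation of x and y with both z and w held constant."""
    # First remove z from every pair that involves x, y, or w ...
    r_xy_z = partial_r(r_xy, r_xz, r_yz)
    r_xw_z = partial_r(r_xw, r_xz, r_zw)
    r_yw_z = partial_r(r_yw, r_yz, r_zw)
    # ... then remove w from the x-y relationship.
    return partial_r(r_xy_z, r_xw_z, r_yw_z)


# Hypothetical zero-order correlations: x = fear decrease, y = identification with
# staff models, z = preexposure fear, w = identification with nursing.
example = second_order_partial_r(r_xy=0.40, r_xz=0.30, r_yz=0.10,
                                 r_xw=-0.10, r_yw=0.45, r_zw=-0.10)
```

When both control variables are uncorrelated with everything else, the second-order partial reduces to the zero-order correlation, which is a quick sanity check on the recursion.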

Similarly, the significant relationship be- 
tween identification with staff models in the 
fear-provoking situation and fear decrease, 
together with the lack of significant relation- 
ship between fear decrease and parental 
identifications, would add further support to 
the suggested necessity that the model be 
present in the fear-arousing situation. This 
identification of the student nurses with their 
parents or other identification models not 
present in the fear-provoking situation would 
not be expected to affect the reduction of 
the specific fear of tuberculosis directly, since 
these models could not be perceived as low 
in fear of the situation. 

Although antecedent-consequent relation- 
ships between identification with staff models 
and fear decrease cannot be identified directly 
from these data, the lack of correlation be- 
tween preexposure fear and identification with 
staff models precludes the interpretation that 
the identification was originally motivated by 
fear of tuberculosis. Thus, the problem of 
what accounts for the identification in the 
first place remains. 

One possible way to account for the ob- 
tained relationship between identification and 
fear decrease combines Kagan’s (1958) anal- 
ysis of the usual process of identification and 
a “snowball” effect resulting from fear de- 
crease acting as a reinforcement for identi- 
fication, while identification in turn aids fear 
reduction. 

Kagan (1958) cited nurturance of the en- 
vironment as the principal positive goal states 
possessed by a model that would motivate 
identification with the model. He indicated 
that the identifier acts like a model to en- 
hance his perception of his similarity to the 
model, since this perceived similarity to the 
model enables him to be reinforced by vicari- 
ously sharing the model’s positive goal states. 
Identification in this formulation is the 
perceived similarity once it is achieved. 

As key figures in the student’s training, 
the staff models might be expected to be 
perceived by the students as having mastered 
the skills of nursing and particularly of 
caring for tuberculosis patients, and also as 
natural sources of support, thus motivating 
identification with them. 

During the process of identifying, the stu- 
dent would be expected to perceive herself 
and to act less fearful (i.e., more like the 
models) to enhance the perceived similarity. 
In this situation, the student would not only 
get the usual reinforcement of vicariously 
sharing the model’s positive goal states, but 
would also receive the strong added rein- 
forcement of the drive reduction associated 
with the decrease in fear. This drive reduc- 
tion would reinforce the act of identifying, 
with a resultant increase in the strength of 
identification. The increase in identification 
would further decrease fear through increased 
generalization of perceived similarity, thus 
generating and accelerating the snowball 
effect. 

Although the above formulation is only one 
of many that could explain the relation be- 
tween identification and fear decrease, it has 
the added advantage of being useful in ex- 
plaining other findings in the present study. 
For example, the strength of these Ss’ identi- 
fications with staff models after only about 
a 3-week acquaintance was almost as great 
as their parental identifications, and the Ss’ 
fears decreased markedly in the same brief pe- 
riod even though they remained in the fear- 
provoking setting. The snowball-effect formu- 
lation would account for these unusually 
rapid and extensive changes in fear and 
identification, in addition to explaining the 
obtained relationship between identification 
with staff models and fear decrease. The 
snowball-effect analysis seems plausible, and 
it integrates and is consistent with the find- 
ings in the present study, but the formulation 
was purely post hoc and will have to be 
confirmed or refuted by further research. 


Relationships among Identification Measures 


Table 1 shows the relationships among the 
identification measures. Parental identifications 
were assessed for several reasons. The 
comparisons of the relationships between 
fear decrease and identification with parents 
who were not present in the fear-arousing 
situation with the relationship between 
fear decrease and identification with staff 
models who were present have already been 
mentioned. 

Since the Ss’ identifications with staff 
models were viewed as analogous to parental 
identifications, it seemed appropriate to study 
the relationships among these identifications. 
The correlation of maternal identification 
with identification with staff models was sig- 
nificant (r = .33, p < .05), even after removal 
by partial correlation of the effects of identi- 
fication with nursing, which was highly cor- 
related with both variables. Before the effects 
of identification with nursing were removed, 
maternal identification was correlated .46, 
p < .001, with identification with staff models. 
The assumed analogy between parental and 
staff identifications was supported by these 
significant relationships between maternal 
and staff-model identifications. 
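
The partialling step described above can be sketched with the standard first-order partial correlation formula. This is only an illustration under stated assumptions, not the computation actually performed in the study: the function name is hypothetical, and the staff-model versus nursing correlation is not reported at this point in the text, so the value used in the example call is a placeholder.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# r_xy: maternal vs. staff-model identification (.46, from the text)
# r_xz: maternal vs. nursing identification (.46, from the text)
# r_yz: staff-model vs. nursing identification is NOT given here;
#       0.40 is a purely hypothetical placeholder to show the call.
example = partial_corr(0.46, 0.46, 0.40)
```

When the third variable is positively related to both of the others, as here, the partial coefficient is smaller than the zero-order one, which is the direction of change the text reports.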

The hypothesized analogy between parental 
and staff identifications was further sup- 
ported by the findings that staff-model, 
maternal, and paternal identifications all were 
significantly related to identification with 
nursing. This essential equivalence in con- 
current validity of the staff and parental 
identification scores, when related to identi- 
fication with nursing, indicated similarity and 
gave additional support to the use of the 
analogy. 

The relationships among parental identi- 
fications and the current vocationally oriented 
identifications of the Ss have general interest 
independent of their specific concern with 
the hypotheses being tested in the present 
study. Identification with nursing was significantly
related to both maternal (r = .46,
p < .001) and paternal (r = .37, p < .01)
identification. Since nursing is usually viewed 
as a feminine profession, at least by laymen, 
the significant relationship between maternal 
and nursing identifications seemed natural. 
The significant relationship between paternal 
and nursing identifications in this context, 
however, would not be predicted so easily. 
The paternal and nursing identification rela- 
tionship found in the present study could 
indicate that further exploration of paternal 
identification and the vocational attitudes of 
females might be fruitful. 

The present study presented an example of 
the continuing importance of identification 
for late adolescents and of the impact of 
models other than their parents. Relation- 
ships among parental and vocationally ori- 
ented identifications were also explored. The 
hypothesis that fear decrease in a fear- 
arousing situation is specifically related to 
identification with models showing low fear 
in the situation was supported by the results. 
A need for further research to clarify cause 
and effect relationships was indicated. 


ACCURACY VERSUS VALIDITY IN PERSON
PERCEPTION 1


RAYMOND E. FANCHER, Jr.
University of Rochester 


24 undergraduate psychology students were asked to predict events from 3 
different case histories and then to write personality conceptualizations for 
the 3 cases. The predictive validity of the conceptualizations was measured by 
having new judges read them and make a new series of predictions about the 
same cases. The accuracy of the original judges in making their predictions 
was significantly negatively correlated with the predictive validity of their 
conceptualizations; that is, the most accurate predictors tended to write the 
least valid conceptualizations. There were indications that the accurate predictors
adopted an empathic approach to the case material, while valid conceptualizers
were more analytic and evaluative in approach. Slight evidence
was presented to the effect that training in psychology was negatively related 
to accuracy, but positively related to validity. 


This research focused upon two functions 
that are crucial to all workers in the field of 
psychological personality theory and espe- 
cially to practicing clinical psychologists, 
namely, the accurate prediction of individual 
behavior and the communication of valid 
conceptualizations of individual personalities.
The major purpose of this research was to 
compare the ability of undergraduate psychol- 
ogy students to predict accurately the ma- 
terial from a series of case studies with their 
ability to conceptualize validly that same ma- 
terial. 

At first thought, it may seem that these two 
abilities should be highly correlated. The ac- 
curate prediction of a person’s behavior and 
the formulation of a valid conceptualization 
of his personality are both tasks that require 
an “understanding” of that person; it might 
be expected that a judge who understands well 
enough to predict accurately will also under- 
stand well enough to conceptualize validly. 
Yet Taft (1955) has suggested that the two
tasks may call for qualitatively different cognitive
operations. He suggested that the prediction
of individual behavior calls for an
empathic, “nonanalytic” approach to the case
material, while the formulation of personality
descriptions or conceptualizations calls for an
inferential, “analytic” approach. Since Taft
viewed analytic and nonanalytic approaches
as being independent (or perhaps contradictory),
his reasoning leads to the hypothesis
that accuracy in prediction and validity in
conceptualization should not be positively
correlated.

1 This research was supported by National Institute
of Mental Health Grant F1 MH20, 904-01. It
represents part of the research included in a doctoral
dissertation presented to the faculty of the Department
of Social Relations, Harvard University. Sincere
thanks are due to Robert Rosenthal, who supervised
the research, and to Robert W. White, who
provided the case materials that were used.

The present study was designed to test this 
hypothesis and, in addition, to determine some 
of the characteristics of accurately-predicting 
and/or validly-conceptualizing judges. 

This research bears an oblique but poten- 
tially important relationship to studies mea- 
suring the adequacy of clinical psychologists’ 
functioning on person-perception tasks. The 
self-esteem of clinicians has suffered in recent 
years as a result of a number of studies indi- 
cating that their predictive accuracy is some- 
thing less than impressive. Meehl (1954, 1965) 
has reviewed more than 50 studies, the vast 
majority of which indicated that statistically 
generated predictions were more accurate than 
the predictions of clinicians. Taft’s (1955) 
review indicated that the interpersonal pre- 
dictions of psychologically trained judges 
were, on the average, somewhat less accurate 
than the predictions of nonpsychologists. 
Thorne’s (1961) attempt to classify the major 
types of clinical error and to prescribe safe- 
guards against them may be viewed as a re- 
flection of the concern that these studies have 
produced in the clinical community. 

These studies may not necessarily represent
a blanket indictment of the interpersonal
sensitivity of clinicians, however. Many of 
the clinical psychologist’s major functions, 
notably the formulation of valid psychological 
reports or the administration of “insight 
psychotherapy,” appear to call primarily for 
the conceptualization and clarification of per- 
sonality data, rather than for explicit predic- 
tions. Thus if the ability to predict and the 
ability to conceptualize are not positively cor- 
related, clinicians may yet demonstrate that 
their greatest competence lies in the concep- 
tualizing of personality. 


METHOD 
Judges 


Two groups of judges were employed in this research.
The first group (J1s) consisted of 24 male
undergraduates at Harvard University who were 
enrolled in a course in abnormal psychology. A 
formal prerequisite for enrollment in this course was 
the completion of a course in personality theory, so 
each of the J1s was presumed to have had at least a
basic knowledge of the major approaches to person- 
ality theory. It was the J1s whose ability to predict 
and conceptualize personality data was assessed in 
this study. 

The second group of judges (J2s) were 72 male 
Harvard undergraduates who had almost completed 
a course in personality theory at the time of their 
participation in the study. Their function was to
validate the personality conceptualizations that had
been written by the J1s.


Measure of Predictive Accuracy 


The predictive accuracy of the J1s was assessed by 
means of a “programmed case” technique, as devised
by Dailey (1963). In a programmed case, judges are 
asked to select from a multiple-choice format a series 
of “true” events in a given real individual’s life. 
After each choice (i.e., deciding which one of a
group of events is true for the subject of the case),
the judge is informed as to which event was, in fact,
true. In the early portions of a programmed case,
the judge has very little or no information on which 
to base his choices, and his accuracy in selecting the 
true events is largely a matter of chance. As he pro- 
ceeds through a case, however, the feedback pro- 
vides him with increasing factual information about 
the subject. This information should theoretically 
enable a judge to make more accurate choices in the 
later portions of the programmed case. It was an- 
ticipated in this study that judges would demon- 
strate individual differences in their ability to use 
the feedback and make accurate choices. 

Three programmed cases were constructed, based 
on three case histories obtained by Robert W. White. 
The material in the case histories was similar to
that presented in White’s published (1952) case his- 
tories, though the cases themselves were different. 
Thirty-nine “events” were abstracted from each case 
and used as the true events in the programmed cases. 
Each of these true events was grouped with two 
other events that were abstracted from other case 
histories. The task of the J1s was to select from the 
programmed-case format the true event in each 
group of three. Measures of their predictive accuracy 
were obtained by counting the number of correct 
choices that were made on each case. 

It was determined that the J1s were highly variable
in the accuracy with which they were able to
predict the true events. Some J1s were correct on
less than 33% of their choices, their accuracy thus
falling below the level that would be expected if
they guessed randomly. Other J1s were accurate on
60-80% of their choices, demonstrating a high de- 
gree of predictive accuracy. The mean number of 
correct choices per programmed case was 16.6, or 
42.5% of the total choices. It was also determined 
that the accuracy levels obtained by a given J1 
tended to be consistent across all three of the pro- 
grammed cases. Intercorrelations of the J1 accuracy 
scores across the three cases were .76, .69, and .84, 
all significant beyond the .001 level of confidence. A 
total accuracy score was obtained for each J1 by 
summing his accuracy scores on the three pro- 
grammed cases. 
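
The chance baseline implied by the three-alternative format can be worked out directly. The sketch below is a small illustration with hypothetical helper names; the item count, the number of alternatives, and the mean of 16.6 are taken from the text.

```python
def chance_expected(n_items, n_alternatives):
    """Expected number of correct choices under random guessing."""
    return n_items / n_alternatives

def percent_correct(n_correct, n_items):
    """Number correct expressed as a percentage of all items."""
    return 100.0 * n_correct / n_items

N_ITEMS = 39         # true events per programmed case (from the text)
N_ALTERNATIVES = 3   # one true event plus two foils (from the text)

chance = chance_expected(N_ITEMS, N_ALTERNATIVES)   # 13 items
chance_pct = percent_correct(chance, N_ITEMS)       # one third of the items
mean_pct = percent_correct(16.6, N_ITEMS)           # from the reported mean
```

Guessing would yield 13 correct choices per case, so the reported mean of 16.6 lies about 3.6 items above chance. Computed from the rounded mean, the percentage comes to roughly 42.6, which agrees with the reported 42.5% once rounding of the mean is allowed for.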

Further information about the accuracy scores and 
the construction of the programmed cases is available 
in an earlier publication by the author (Fancher, 
1966). 


Measure of Predictive Validity 


Upon completing each programmed case, each J1 
was asked to write a conceptualization of the case 
subject’s personality, using any technique that 
seemed appropriate. Thus 72 different conceptualiza- 
tions were written (3 conceptualizations from each 
J1). These conceptualizations were assessed for 
their predictive validity in the following manner. 

A “validating questionnaire” was constructed for 
each of the three cases by abstracting 24 additional 
events from each of the case histories. Each of these 
events was grouped with two alternative events, as 
had been done for the programmed cases. The vali- 
dating questionnaires did not, however, provide feed- 
back about the correct choices. Each validating 
questionnaire was in essence a 24-item multiple- 
choice test. 

The experimental task for the J2s was to read the 
J1s’ conceptualizations and then fill out the validating
questionnaires on the basis of what they had
gleaned from the conceptualizations. Presumably, the 
more adequate conceptualizations enabled the J2s to 
make more accurate choices on the validating ques- 
tionnaires. 

Each conceptualization was read by three different 
J2s, each of whom subsequently filled out the ap- 
propriate validating questionnaire. In this way three 
different validity scores were obtained for each conceptualization
(the number of correct choices by
each J2) and nine scores for each conceptualizer 
(the scores on the three conceptualizations written 
by each J1). 

The mean validity score attained by the J2s was 
8.5 correct choices per validating questionnaire. 
Since a mean of 8.0 correct choices would be ex- 
pected if the J2s guessed randomly, it is obvious 
that the conceptualizations taken as a group did not 
have a high degree of predictive validity. However, 
a confidence interval constructed about the obtained 
mean of 8.5 indicated that it exceeded the chance 
figure of 8.0 with a probability of more than .99. 

To assess the consistency of the validity scores
attained by each J1, an intraclass-correlation coefficient
(rI) was computed for all nine of the J1s’
validity scores pooled together. The rI was .07, significant
at the .05 level by means of an F test (F
= 1.69, df = 23/192). There was a slight but statistically
significant tendency for individual J1s to
consistently write relatively valid or invalid conceptualizations.2
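
The reported coefficient and degrees of freedom are consistent with the usual one-way analysis-of-variance form of the intraclass correlation. The sketch below assumes that this is the form that was used, which the text does not state explicitly; the function names are hypothetical.

```python
def intraclass_r(f_ratio, k):
    """One-way ANOVA intraclass correlation from F, with k scores per judge."""
    return (f_ratio - 1) / (f_ratio + k - 1)

def anova_df(n_judges, k):
    """Between-judge and within-judge degrees of freedom."""
    return n_judges - 1, n_judges * (k - 1)

r_i = intraclass_r(1.69, 9)   # F = 1.69, nine validity scores per J1
df = anova_df(24, 9)          # 24 J1s with nine scores each
```

With F = 1.69 and nine scores per J1, the formula gives a coefficient that rounds to .07, and 24 J1s with nine scores each give 23 and 192 degrees of freedom, matching the figures in the text.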

A total validity score was obtained for each J1 
by summing the nine individual validity scores re- 
sulting from his conceptualizations. 


Measures of J1 Characteristics 


Immediately prior to working on the programmed 
cases, each J1 took a shortened group form of 
Kelly’s (1955) Role Construct Repertory Test (Rep 
test), as described by Bieri and Blacker (1956). The 
Rep tests were employed primarily to determine the 
extent to which individual J1s tended to use “objective”
or “evaluative” constructs in categorizing
themselves and their acquaintances. Each Rep test 
was scored for objectivity and evaluation by count- 
ing the number of objective or evaluative constructs 
it contained. Constructs were defined as objective if 
they could be easily verified by an external observer 
(referring to age, sex, occupational role, etc.), or if 
objective substantiation or clarification was pro- 
vided (e.g., “religious—goes to church every week”). 
Evaluative constructs were defined as containing 
clearly evaluative modifiers, or as making reference 
to mental health, or as having clearly evaluative 
connotations (e.g., “attractive,” “stingy,” etc.). Two
independent judges scored all 24 Rep tests for both
objectivity and evaluation. Interjudge r’s for the two
measures were .85 and .83, indicating a satisfactory
level of scoring reliability. The objectivity and evaluation
measures were not found to be completely
independent of one another, since the r between them
was -.61.


2 In an attempt to rule out the possibility that
the validity scores were merely artifactual functions
of the lengths of the conceptualizations, correlations
were computed between the number of words contained
in the conceptualizations and the validity
scores they obtained. The r’s for the 3 cases were
.20, .06, and -.19, all insignificant.

In each Rep test, each J1 was asked on 10 occasions
to consider himself in relation to two of his
acquaintances and to indicate which two of the three
persons were most alike. It was hypothesized that a
tendency on the part of a J1 to consistently categorize
himself as different from the others (i.e., to
indicate that the other two persons were most alike) 
might reflect another tendency on his part to view 
himself as different from other people in general. 
This could possibly influence his method of predict- 
ing and/or conceptualizing, so a count was made of 
the number of times each J1 categorized himself as 
the different one on the Rep test sorts. 

The preferences of the J1s for differing approaches 
to personality theory were assessed by means of a 
questionnaire which listed the 18 variables used by 
Hall and Lindzey (1957) in comparing the various 
theorists discussed in their textbook. These variables 
were (a) purposiveness—the teleological nature of 
behavior; (b) unconscious determinants; (c) re- 
ward as a prerequisite to learning; (d) learning by 
contiguity; (e) the learning process; (f) focus on 
the stable, structural aspects of personality; (g) 
heredity; (h) early developmental experience; (i)
continuity of development from birth through ma- 
turity; (j) organismic emphasis; (k) field emphasis; 
(l) the uniqueness of personality; (m) psychological
environment—the subjective frame of reference; (n) 
self-concept; (o) group-membership determinants; 
(p) interdisciplinary emphasis on biology; (q) interdisciplinary
emphasis on the social sciences; and
(r) multiplicity of motives. Each J1 was asked to 
indicate, on a 3-point scale, how useful he found 
each of these approaches or emphases to be in deal- 
ing with the cases (usefulness ratings) and how im- 
portant he felt each of them to be in his own per- 
sonal approach to personality theory (espousal rat- 
ings). 

Information was also collected about each J1 con- 
cerning his College Entrance Examination Board 
scores, the number of prior courses he had taken in 
psychology, and the grade that he ultimately received 
in the abnormal psychology course from which he 
had been recruited. 


RESULTS AND DISCUSSION 
Relationship between Accuracy and Validity 


The correlation measuring the relationship
between the J1s’ total accuracy and total
validity scores was computed, and the r was
found to be -.41 (p < .05, two-tailed test).
In an effort to determine whether this negative
correlation could be attributed to a disproportionate
effect from just one of the cases,
individual correlations were computed between
the accuracy and validity scores for
the three cases separately. The r’s were -.40,
-.27, and -.20. Though only the first of
those r’s was statistically significant (p < .05),
the fact that all of them were negative suggests
that there was a generalized and moderate
inverse relationship between the ability
accurately to predict personality data and the
ability to conceptualize that same data with a
degree of predictive validity.


TABLE 1
CORRELATES OF ACCURACY AND VALIDITY

Variable                               r with      r with      p for difference
                                       accuracy    validity    between r’s

Psychological environment (U)           .45*       -.40*       .005
Rep test-objectivity a                  .60**      -.59**      .001
Rep test-evaluation a                  -.40*       36134       .001

Structural aspects (U)                  .40*        .01        ns
Structural aspects (E)                  .44*       -.24        .05
Heredity (U)                            .50*       -.13        .05
Emphasis on social science (E)         -.49*        .05        ns
Rep test-isolation of self             -.45*        .16        .05

Multiplicity of motives (U)             .09        -.42*       ns
Learning by contiguity (U)              .09        -.48*       .05
Early developmental experience (E)      .31         .46*       ns

Note. Abbreviated: (U) = usefulness ratings; (E) = espousal ratings. All Ns = 24.
a Corrected for attenuation due to scoring unreliability.
* p < .05.
** p < .01.
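
The significance of the overall correlation reported in this section (r = -.41, N = 24) can be checked with the usual t transformation of a correlation coefficient. The sketch below is illustrative only: the function name is hypothetical, and the critical value is the standard tabled two-tailed .05 value of t for 22 degrees of freedom.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

T_CRIT_05_DF22 = 2.074   # tabled two-tailed .05 critical t for df = 22

t_total = t_from_r(-0.41, 24)                  # total accuracy vs. validity
is_significant = abs(t_total) > T_CRIT_05_DF22
```

For r = -.41 the statistic is about -2.11, which exceeds the critical value in absolute size, in agreement with the reported p < .05. With only 22 degrees of freedom the critical correlation is close to .40, so coefficients printed as .40 sit at the boundary of significance and depend on the unrounded values.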


J1 Characteristics Correlated with Accuracy 
and Validity 


The J1 characteristics that were found to
correlate with total accuracy and/or total
validity are presented in Table 1, and they
may be subdivided into three categories: those
correlated significantly with both accuracy
and validity, those correlated with accuracy
but not with validity, and those correlated
with validity but not with accuracy.3 Of the
36 theoretical emphasis variables (18 usefulness
and 18 espousal ratings), 5 were significantly
correlated with accuracy and 4 with
validity (p < .05, two-tailed).

3 A more complete analysis of the correlates of
accuracy has been presented in an earlier publication
(Fancher, 1966). The present paper is concerned
with the correlates of accuracy only as they
compare with the correlates of validity.

The top portion of Table 1 lists the three
variables that were significantly correlated
with both accuracy and validity. In all three
cases the signs of the r’s were reversed for
accuracy and validity, suggesting that these
variables may lie close to the heart of the
difference between accurately-predicting and
validly-conceptualizing J1s. Accurate J1s
tended to emphasize the subjective frames of
reference (psychological environments) of the
subjects of the cases and to employ objective
and nonevaluative constructs in their Rep
tests. These results would appear to corroborate
the popular image of a sensitive interpersonal
judge as a person who is “fair” in
his thinking about others and who tries to
see things from the other fellow’s point of
view. The “psychological environment” finding
also suggests that accurate J1s may have
adopted an “empathic” attitude toward the
cases and this attitude may have been especially
helpful in the prediction of behavior,
which Taft (1955) has described as a nonanalytic,
empathic task.

Validly-conceptualizing J1s, on the other
hand, did not emphasize the subjective frames
of reference of the subjects of the cases and
tended to be evaluative and nonobjective on
their Rep tests. These results suggest that
these J1s, far from adopting an empathic approach
to the cases, may have tended to impose
their own preexisting and subjective,
but well-systematized, category systems upon
the case material. Thus they may have regarded
the cases analytically, as objects to be
evaluated and categorized. This kind of approach
may have had the advantage of facilitating
systematic formulations of the cases
and the disadvantage of inhibiting an intuitive
“feel” for the cases.

The middle portion of Table 1 lists those 
variables that were significantly correlated 
with accuracy but not with validity. Accurate 
J1s rated themselves as emphasizing, both in
belief and in practice, the stable structural 
aspects of personality. It is likely that they 
tended to seek out consistent traits in the 
data from the cases. They also avowedly 
found it useful to emphasize the hereditary 
aspects of the case materials. On the basis of 
informal conversations with some of the J1s,
it appeared to the experimenter that a high 
rating on the heredity variable was also indic- 
ative of a concern with family background 
factors such as socioeconomic status, religion, 
etc. These factors may have been important 
determinants of the accurate J1s’ predictions. 
Accurate J1s did not espouse an interdisciplinary
emphasis on social science, perhaps
suggesting that they preferred to maintain a 
focus on the individual personalities repre- 
sented in the cases, rather than to abstract 
social variables from a large number of cases. 
The finding that accurate J1s tended to
see themselves as similar to others is interest- 
ing, because such an attitude might facilitate 
the adoption of an empathic approach to the 
prediction of behavior. A judge who assumes 
that he is similar to other people in general 
may feel comfortable in basing his predictions 
on his own imagined responses to particular 
situations.

The variables that were correlated with 
validity but not with accuracy are listed at 
the bottom of Table 1. Validly-conceptualiz- 
ing J1s said they found it useful to emphasize 
few motives, to deemphasize learning by con- 
tiguity, and they espoused an approach to 
personality that emphasized the importance 
of early developmental experience. 

While the exact interpretation of the mean- 
ings of these individual findings may be open 
to some question, one clear fact that has 
emerged is that prediction and conceptuali- 
zation are definitely different tasks, calling 
for different and sometimes mutually exclusive 
attitudes and abilities on the part of the J1s.


Future research efforts might profitably be 
directed toward the specification and elabora- 
tion of these attitudes and abilities. 


Incidental Findings and Implications for 
Clinical Psychology 


The results of this research, while derived 
from a population of undergraduate psychol- 
ogy students, are not inconsistent with the 
notion that clinical psychologists may be rela- 
tively valid conceptualizers of personality. 
Accurate prediction was found here to be 
negatively correlated with valid conceptuali- 
zation, and it may be that this same relation- 
ship would prevail for a judge population 
that included clinical psychologists. Since 
clinicians have tended to do poorly on predic- 
tion tasks, they may tend to do well on tasks 
of conceptualization. 

A very indirect test of this speculation was 
undertaken by examining the performance of 
the J1s in this study who might be expected
to be most like clinical psychologists. Two 
possible measures of similarity were employed. 
First, since clinical psychologists have had a 
great deal of psychological training, it might 
be supposed that those J1s who had taken
the greatest number of psychology courses 
would be most like professional clinicians. 
Thus an experience score was computed for 
each J1 by counting the number of psychol- 
ogy courses he had taken. Second, the grades 
obtained by the J1s in the abnormal psychology
class from which they had been recruited
were taken as another possible measure of
similarity, since J1s with high grades might
be assumed to be most competent in psychology
and, perhaps, most likely to go to graduate
school in clinical psychology. The experiential
factor was not significantly related
to accuracy or validity; the r’s between number
of courses taken in psychology and accuracy
and validity were -.17 and .05, respectively.

Course grades in abnormal psychology were
found to be negatively though insignificantly
related to accuracy (r = -.31, p < .15) but
positively and significantly related to validity
(r = .47, p < .05). The difference between
the two r’s was significant at the .01 confidence
level. It is also worth noting that the
J1s’ verbal and quantitative College Entrance
Examination Board aptitude scores were not
significantly correlated with validity (r’s = .05
and .02, respectively), indicating that the correlation
between grades and validity cannot
be explained away simply as a reflection of a 
relationship between generalized scholastic 
aptitude and validity. These data provide a 
slim measure of support for the notion that 
psychological competence leads to valid con- 
ceptualization—if not to accurate prediction. 
The final test, of course, must await the as- 
sessment of the conceptualizing ability of prac- 
ticing clinical psychologists. 


Separate factor analyses were performed on the interview, ward behavior, and
self-report ratings of 124 depressed patients from 10 hospitals. The major 
categories of psychopathology discernable from these analyses were: (a) in- 
terest and involvement in activities, (b) hostility, (c) feelings of guilt and 
worthlessness, (d) anxiety-tension, (e) sleep disturbance, (f) somatic complaints,
(g) retardation in speech and behavior, (h) conceptual disorganization,
and (i) depressive mood.


The present study was undertaken to iden- 
tify relatively independent factors of psycho- 
pathology in hospitalized depressed patients. 
These factors are to be used as change or cri- 
terion measures in a collaborative study of 
drug treatment in depression. An effort was 
made to include all of the major primary and 
secondary symptoms regarded as character- 
istic of depression. Ratings of the presence 
and severity of these symptoms were obtained 
from psychologists and psychiatrists following 
an interview with the patient, from the pa- 
tients themselves on a self-report form, and 
from nurses’ ratings of patient ward behavior. 
Consequently, the study also afforded a unique 
opportunity to compare and contrast factors 
of psychopathology that emerged from these 
three sources of information. 


METHOD 
Subjects 


The study sample consisted of 42 male and 82
female patients drawn from the psychiatric populations
of two large metropolitan receiving hospitals,2
three state hospitals,3 and four private institutions.4
Newly admitted patients between 16 and 70 years of
age and with no evidence of mental deficiency, liver
damage, unequivocal brain damage, cardiovascular
disease, or epilepsy were admitted to the study.
Project coordinators (a psychiatrist or psychologist)
at each hospital also rated patients on amount
of depression in verbal report, behavior, and secondary
symptoms of depression. Each of these
three items was rated on five-point intensity scales.
For admission to the study a patient had to achieve
a total score of at least nine, which would be the
equivalent of a moderate amount of depression.5


1 Prior investigators who have factor analyzed the
symptoms of depressed psychiatric patients include
Cowitz, Cohen, and Friedman (1963), Grinker, Miller,
Sabshin, Nunn, and Nunnally (1961), Hamilton
and White (1959), Kiloh and Garside (1963), Lorr,
Klett, and McNair (1963), Overall (1962), and
Wittenborn (1962).

2 District of Columbia General Hospital, Washington,
D. C.; Malcolm Bliss Mental Health Center,
St. Louis, Missouri.

3 John Umstead Hospital, Butner, North Carolina;
Rochester State Hospital, Rochester, Minnesota;
Rochester State Hospital, Rochester, New
York.

4 Hartford Hospital, Hartford, Connecticut; 
Mercy-Douglass Hospital, Philadelphia, Pennsyl- 
vania; Philadelphia Psychiatric Center, Philadelphia, 
Pennsylvania; The Sheppard and Enoch Pratt Hos- 
pital, Towson, Maryland. 

5 One of the major aims of this study was to de- 
velop criterion measures for evaluating the differ- 
ential effects of certain drugs on the major symp- 
toms of depression in a broad sample of hospitalized 
psychiatric patients. Consequently, we selected the 
presence of at least a moderate amount of depression
in verbal report, behavior, and the secondary
symptoms of depression, rather than certain assigned 
diagnoses, as our criterion for inclusion in the study. 
Further, we were concerned about the reliability of 
psychiatric diagnoses. For example, Katz (1964)
notes three types of defects in the current psychi- 
atric diagnostic system. These include getting two 
clinicians to agree on the diagnosis of a given pa- 
tient (Beck, Ward, Mendelson, Mack, & Ebergh, 
1962), inconsistencies in the assignment criteria
(Ward, Beck, Mendelson, Mack, & Ebergh, 1962),
and the assigning of very different kinds of patients


Procedure


Patients were interviewed by two psychiatrists or 
a psychiatrist and a psychologist. Following this 
interview, each interviewer rated the patient on a 
41-item inventory of psychic and somatic com- 
plaints (IPSC). The latter is a modification of the 
Symptom Distress Scale (Frank, Gliedman, Imber, 
Nash, & Stone, 1957) with additional items sampling 
secondary symptoms of depression. These ratings 
were made on seven-point degree-of-discomfort scales. 
All of the ratings for the present study were ob- 
tained before the patient’s fourth day in the hos- 
pital and prior to his assignment to one of three 
drug-treatment groups. 

Patients rated themselves on a 53-item version of 
the IPSC. The 12 additional items in the patient 
IPSC, as compared to the psychiatrist IPSC, were 
mainly specific somatic complaints. In addition, pa- 
tients completed a 52-item adjective checklist (mood 
scales). These items were gleaned from a variety of 
sources with particular emphasis given to adjectives 
which have reflected shifts in patient mood as a 
function of drug treatment (Clyde, 1963) and have 
differentiated depressed patients from normals 
(Friedman, 1964). Items on the patient IPSC and 
the mood scales were rated on four-point intensity 
scales. Finally, two nurses rated each patient on the
Ward Behavior Rating Scale (WBRS; Burdock, 
Hardesty, Hakerem, & Zubin, 1960). The WBRS 
was modified so that each item would be rated on a 
four-point frequency-of-occurrence scale. In the 
original version, each checklist item is rated present 
or absent. 

The WBRS contained 153 items. To facilitate the 
factor analysis of the WBRS and to avoid certain 
computational pitfalls when there are fewer rows 
(patients) than columns (items) in the matrix being 
factor analyzed, it was decided to eliminate items 
with variances of .05 or less. The 41 items meeting 
this criterion and eliminated from the WBRS factor 
analysis were mainly bizarre behaviors character- 
istic of some chronic schizophrenics, for example, 
“wet bed or clothing (incontinent),” “play with 
genitals.” 

The procedures for factor analyzing the IPSC 
(psychiatrist), IPSC (patient), mood scales, and 
WBRS were identical.6 As all items were rated on 
4-9-point scales, they were correlated using the 


to the same category (Katz, Cole, & Lowery, 1964). 
Although a diagnosis of depression was not a neces- 
Sary prerequisite for inclusion in the study, the 
breakdown of initial diagnoses by the project co- 
ordinators was as follows: involutional psychotic 
reaction (n= 14); manic depressive reaction, depres- 
sive type (n=13); psychotic depressive reaction 
(n= 16); schizophrenic reaction, schizo-affective 
type (n= 18); schizophrenic reaction, other (n= 
6); depressive reaction (= 54); other neurotic 
reactions (n = 3). 

6 Statistical computations were performed at the 
Biometric Laboratory, George Washington Univer- 
sity, Washington, D. C. 


Pearson product-moment method. The correlation 
matrices were then factor analyzed using Hotelling’s 
principal components method with unity in the 
diagonals. To obtain relatively independent factors, 
a normal Varimax rotation, which provides an 
orthogonal solution, was performed on all factors 
with eigenvalues of one or greater. Additional normal 
Varimax rotations were then performed to succes- 
sively reduce the number of factors in step intervals 
of one to a two-factor solution. This procedure 
permitted us to select a “best” factor solution for 
each instrument. “Best” meant selecting the factor 
solution which maintained the independence of the 
major dimensions of psychopathology which seemed 
to be present in a particular instrument, at the 
same time ensuring the presence of at least three 
significant items in each major factor, Item signifi- 
cance was arbitrarily defined as a loading of .40 
or higher on one factor and no loading of .40 or 
higher on any other factor. For inclusion in a 
factor an item’s mean correlation with all other 
items in a factor also had to be statistically signifi- 
cant (r= > .20; p< .05).7 Three items failed to 
meet this relatively lenient criterion of internal con- 
sistency with other items in a factor, The solutions 
in six factors were selected for the IPSC (psychia- 
trist), mood scales, and WBRS. The solution in 
seven factors was selected for the IPSC (patient). 
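
Two of the item-screening rules in this section can be sketched in code: dropping items whose variance is .05 or less before factoring, and counting an item as significant only when it loads .40 or higher on exactly one factor. The sketch below is illustrative only and is not the authors' program. The function names, the use of population variance, the use of absolute loadings (assumed because key items in the tables carry negative loadings), and the small rating matrix are all assumptions; two of the loading rows are taken from Table 1 with decimal points restored.

```python
from statistics import pvariance


def drop_low_variance_items(ratings, threshold=0.05):
    """Return indices of items (columns) whose variance exceeds `threshold`.

    `ratings` holds one row per patient and one column per item.
    """
    kept = []
    for j in range(len(ratings[0])):
        column = [row[j] for row in ratings]
        # Population variance is an assumption; the text names no estimator.
        if pvariance(column) > threshold:
            kept.append(j)
    return kept


def significant_items(loadings, cutoff=0.40):
    """Group items by the single factor on which they reach `cutoff`.

    `loadings` maps an item label to its rotated loadings, one per factor.
    Items reaching the cutoff on no factor, or on more than one, are dropped.
    Factors are numbered from 1, as in the tables.
    """
    by_factor = {}
    for item, row in loadings.items():
        # Absolute values are an assumption (see the lead-in above).
        high = [k for k, value in enumerate(row) if abs(value) >= cutoff]
        if len(high) == 1:
            by_factor.setdefault(high[0] + 1, []).append(item)
    return by_factor


# Hypothetical ward ratings: 5 patients by 3 items on a four-point scale.
ratings = [[1, 1, 2], [1, 3, 2], [1, 4, 2], [1, 2, 2], [1, 1, 3]]
kept = drop_low_variance_items(ratings)  # item 0 never varies and is dropped

# Two rows from Table 1 plus one hypothetical item that loads on two factors.
loadings = {
    "No interest in things": [-0.82, -0.10, -0.13, 0.12, 0.02, -0.09],
    "Temper outbursts": [0.09, 0.59, 0.04, -0.12, -0.04, -0.00],
    "hypothetical cross-loading item": [0.45, 0.50, 0.00, 0.00, 0.00, 0.00],
}
salient = significant_items(loadings)
```

The further requirement of at least three significant items per major factor could be checked by counting the entries in each list returned by `significant_items`.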


RESULTS 


The three or four items with highest 
loadings on factors derived from the IPSC 
(psychiatrist), IPSC (patient), mood scales, 
and WBRS are presented in Tables 1, 2, 3, 
and 4, respectively.8

Item content suggested a grouping of fac- 
tors from the various instruments into nine 
categories. These were labeled interest and 
involvement in activities, hostility, feelings 
of guilt and worthlessness, anxiety-tension, 
sleep disturbance, somatic complaints, retar- 
dation in speech and behavior, conceptual 
disorganization, and depressive mood. 


7 The mean correlation of an item with the other
items in a factor was computed by converting the 
individual item correlations to z scores, obtaining an 
average z score, and converting this value back to a 
correlation coefficient. 
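
The averaging described in this footnote is the usual r-to-z procedure, which might be sketched as follows. This is a minimal illustration, assuming the "z scores" are Fisher's z transformation, z = arctanh(r); the function name and the example correlations are hypothetical.

```python
import math


def mean_item_correlation(correlations):
    """Average correlations by way of Fisher's r-to-z transformation."""
    z_values = [math.atanh(r) for r in correlations]  # r -> z
    mean_z = sum(z_values) / len(z_values)            # average in z units
    return math.tanh(mean_z)                          # z -> r


# Hypothetical correlations of one item with three other items in a factor.
example = mean_item_correlation([0.20, 0.40, 0.60])
```

For these values the result is slightly above the arithmetic mean of .40, because the transformation gives more weight to the larger correlations.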

8 Tables A through D, containing all of the sig- 
nificant items in factors derived from the patient 
and psychiatrist IPSC, mood scales, and WBRS have 
been deposited with the American Documentation 
Institute. Order Document No. 9390 from ADI 
Auxiliary Publications Project, Photoduplication 
Service, Library of Congress, Washington, D. C. 
20540. Remit in advance $1.25 for microfilm or
$1.25 for photocopies, and make checks payable to:
Chief, Photoduplication Service, Library of Congress.


272 RASKIN, SCHULTERBRANDT, REATIG, AND RICE 


TABLE 1

KEY ITEMS ON THE SIX FACTORS DERIVED FROM THE INVENTORY OF PSYCHIC AND
SOMATIC COMPLAINTS (PSYCHIATRIST)

                                                                    Factor loading
Item
No.   Item content                                              1    2    3    4    5    6

31.   No interest in things                                   -82  -10  -13   12   02  -09
54.   Loss of interest in activities previously enjoyed       -81   03  -16   12  -05  -03
53.   Everything is an effort                                 -80  -01   02   19  -15  -08
37.   Low in energy or slowed down                            -75  -22  -08  -05  -26  -20

42.   Impulses to beat/harm someone                            07   79  -12   03  -09   10
24.   Feels critical of others                                 12   64  -21   07  -05  -27
39.   Impulses to smash things                                -02   60  -22   07  -09  -04
33.   Temper outbursts                                         09   59   04  -12  -04  -00

35.   Thoughts of ending life                                 -09   34  -70   07  -05   03
55.   Thoughts of death and dying                             -22   38  -66   04  -02  -03
40.   Unable to get rid of bad thoughts                        05  -02  -62   34  -03  -15
25.   Blames self for things done/not done                     10  -36  -58  -00  -25  -20

47.   Does things slowly to do them right                     -22  -08   14   68  -27  -26
38.   Suddenly scared for no reason                           -04   05  -23   67   09  -11
49.   Has to check and double check what he does              -07  -19   13   62  -21  -13
27.   Tense, keyed up                                         -34   01  -18   56  -08  -05

20.   Sleep restless/disturbed                                -25   06  -10   11  -80   09
[?]   Awake in early morning/difficulty falling back asleep   -36   11  -10   28  -64  -09
18.   Difficulty falling asleep                               -31   26  -08   01  -63  -04

28.   People are unfriendly/dislike him                        04   16  -13  -01   06  -76
41.   Painfully self-conscious                                -29   02  -06   25  -02  -73
30.   Others do not understand/unsympathetic                  -01   27   16  -05  -09  [?]
21.   Feels inferior to others                                [?]  [?]   27   27   16  -69

      Percentage of variance                                 11.8  7.5  7.5  9.3  6.9  8.5

Note. As all factor loadings are less than one, decimal points have been omitted.
Entries shown as [?] are illegible in the source.


Interest and Involvement in Activities 


Factors characterizing the patient’s energy
level and involvement in on-going activities
emerged from the interview, ward behavior,
and self-report ratings. These were Factor 1,
Anergia on the IPSC (psychiatrist); Factor
1, Anxious Depression on the IPSC (patient);
Factor 1, Social Participation on the
WBRS; and Factor 1, Energetic, on the mood
scales.

The key items in these factors reflected a
loss of interest in activities, feeling slowed
down or lacking in energy, and a failure to
participate in ward and other hospital activities.
Factor 1 of the psychiatrist IPSC, Factor 1
of the patient IPSC, and Factor 1 of
the WBRS also accounted for the greatest
percentage of item variance on these instruments,
11.8, 13.0, and 13.5, respectively.

On the IPSC (patient), Factor 1 also contained
items with significant factor loadings
which reflected depressive affect and anxiety,
for example, “feeling blue,” “feeling tense or
keyed up,” and “feeling fearful.” Consequently,
in contrast to ratings by the psychiatrists
and psychologists, patients were less
likely to differentiate among these symptoms.
This was probably due to the fact that many
depressed patients were experiencing jointly
symptoms of anxiety and depression. Evidence
for this view is to be found in a recent
study by Overall and Hollister (1965) who
isolated three depressive subtypes, anxious
depressives, hostile depressives, and retarded
depressives. Fully three-fifths of their sample
fell in the anxious depression subtype.


TABLE 2

KEY ITEMS ON THE SEVEN FACTORS DERIVED FROM THE INVENTORY OF PSYCHIC AND
SOMATIC COMPLAINTS (PATIENT)

                                                                  Factor loading
Item
No.   Item content                                          1    2    3    4    5    6    7

20.   No interest in things                               -85   01   19  -04   04   05   10
43.   Everything an effort                                -82   24   00  -18   14  -06   10
44.   Loss of interest in activities previously enjoyed   -80   05   07  -10   12   19   12
21.   Low in energy, slowed down                          -74   15  -01  -06   17  -22   28

27.   Critical of others                                  -13   72   12  -15   10   01   02
16.   Easily annoyed, irritated                           -32  [?]  -11  -10   11  -25  -04
8.    Others don’t understand, are unsympathetic          -12   62   16  -27   29  -13  -30
36.   Impulses to harm someone                            -06   57  -02   06  -07  -06   15

6.    Unable to get rid of bad thoughts and ideas         -12   04  [?]  -23   23  -25   01
[?]   Blames self for things done or not done             -22   33   51  -12  -07   01   09
29.   Feeling people are watching or talking about you    -10   37   47  -13  -02  -06   00

14.   Faintness or dizziness                              -10   09   29  -66   03  -06   09
16.   Check and doublecheck things done                   -26   21   05  -64   18   04   12
7.    Do things slowly to do them right                   -21  -09   08  -63   16   08   09
13.   Heart pounding or racing                            -03   20   04  [?]  [?]  -34   11

4.    Difficulty falling asleep                            14  -02  [?]   05  [?]  [?]   17
15.   Restless, disturbed sleep                           -20   14   16  -29   76   07   01
26.   Awake early, trouble going back to sleep            -16  [?]  [?]  [?]  [?]   07   06

52.   Sensation of choking or suffocating                 -10   06   02  -09  -09  -61   01
[?]   Crying easily                                       -24   34   18   08   07   53   08
40.   Trembling                                           -30   08   12  -37  -06  -48   10
25.   Pains in the heart and chest                         06   13   05   11   19  -46   32

24.   Pains in stomach                                    -07   13  [?]  -21   06  -34   70
46.   Backaches or muscular aches                          11   00   28   01   08  -10   65
48.   Nausea or upset stomach                             -23  -08  [?]  -21   01  -34   47

      Percentage of variance                             13.0  8.3  4.9  5.5  8.5  6.0  5.7

Note. As all factor loadings are less than one, decimal points have been omitted.
Entries shown as [?] are illegible in the source.


Hostility 

A second area of behavior with correlates
in the different rating instruments was hostility.
However, its form varied somewhat
from instrument to instrument. The hostility
items on the psychiatrist IPSC (Factor 2) are
suggestive of a heightened emotional state
with uncontrollable temper outbursts and a
preoccupation with and fear of losing control
entirely by smashing things or harming others.
Hostility on the patient IPSC (Factor 2) was
more general. The items mentioned above
had significant loadings on this factor, but
feeling critical of others and feeling that
others do not understand or are critical also
predominated, as did suicidal ideation, that
is, aggression against the self. As noted previously,
there was a tendency for factors
derived from the patient IPSC to be broader
in scope and less clearly delineated than
similar factors on the psychiatrist IPSC.
Forms of indirect hostility, that is, rudeness,
sarcasm, and a lack of kindness were the
major elements in the Hostility factor (Factor 2)
on the mood scales. An easy irritability
and low frustration tolerance seemed to characterize
the items having the highest loadings
on WBRS Hostility (Factor 2).


TABLE 3

KEY ITEMS ON THE SIX FACTORS DERIVED FROM THE MOOD SCALES (PATIENT)

                                        Factor loading
Item
No.   Item content                  1    2    3    4    5    6

19.   Active                      -66   18   14   04   31  [?]
51.   Able to work                -63   20   03  -37   10   09
38.   Alert                       -60  -08   09  -06  -05  -13

41.   Rude                        -02   79  -06   13   06   01
45.   Sarcastic                    00   76  -22   10   12  -04
3.    Angry                       -20  [?]  [?]   26   33   02

42.   Sorry for things done        00   13  -76   05   24   08
47.   [illegible]                  01   22  -76  -03   23  -09
36.   Troubled by conscience      -06   06  -70   16   26  -06

44.   Jittery                      12   22  -14   79   12  -15
40.   Nervous                      16  -04  -01   78   16  -16
2.    Tense                        06   06  -21   73   26  -27
33.   Restless                     16   05  -04   73   25  -02

50.   Lonely                      -07   09  -08   03   81  -01
[?]   Downhearted                  02   06  -19   34   74  -10
39.   Blue                         06   07  -17   38   72  -17
22.   Unhappy                     -06   05  -24   25   69  -39

14.   Carefree                     01   00   20   10  -17   78
25.   Cheerful                    -11  -02   00  -13  -11   77
26.   Satisfied                   -32  -16   04  -10  -17   66
4.    Happy                        01   23   00  -26  -36   65

      Percentage of variance     15.5 10.0  7.6 12.3  7.3  6.0

Note. As all factor loadings are less than one, decimal points have been omitted.
Entries shown as [?] or [illegible] cannot be read in the source.


Feelings of Guilt and Worthlessness 


Factors characterizing feelings of guilt,
worthlessness, and low self-esteem also appeared
in all instruments. A Loss-of-Esteem
factor (Factor 6), in which the patient feels
inferior and worthless and feels that others
do not like or understand him, appeared in
the psychiatrist IPSC. Guilt, in the sense of
blaming oneself for things done or not done,
and associated thoughts of death or dying
and thoughts of ending one’s life appeared in
Factor 3 (Morbid Obsessions) of this same
instrument. On the patient IPSC, there was
no comparable Loss-of-Esteem factor. Items
related to the feeling that people do not
understand or are unsympathetic loaded on
the Hostility factor (Factor 2). However, a
Morbid Obsessions factor (Factor 3) did
emerge and is similar in content to the comparable
factor on the psychiatrist IPSC.
Factor 3 of the mood scales, Guilty-Ashamed,
included elements of both guilt and shame.
WBRS Factor 3, Feelings of Guilt and
Worthlessness, contained the major elements
previously identified in the other instruments.
Included in this factor were feelings of worthlessness,
not being liked by others, guilt,
shame, and a preoccupation with suicidal
ideation.


TABLE 4

KEY ITEMS ON THE SIX FACTORS DERIVED FROM THE WARD BEHAVIOR RATING SCALE (NURSE)

                                                                Factor loading
Item
No.   Item content                                          1    2    3    4    5    6

39.   Makes small talk                                     82  -18   13  -02   01   14
66.   Shows pleasure in recreation and entertainment       77  -02  -20   02  -03   12
45.   Joins in social games                               [?]  -04  -24   07   03  -05
124.  Likes to be occupied                                 74   02   04  -03  -29   12

74.   Becomes upset when things do not suit him           -02   64   17   24   13   09
[?]   Gets angry when kidded                              -12   61   16   02  -10   06
16.   Shows irritability or grouchiness                   -05   61   03   17   06  -01
23.   Shows impatience                                     17   55   28   31  -11   07

140.  Says he is no good                                   01   07   17   06   15  -08
77.   Talks about his unworthiness, sinfulness            -06   07   76   06   19   01
21.   Says he wants to die                                -09   12   73  -02   07   02
85.   Says people hate him                                 10   28   70   20  -25  -09

91.   Talks, mutters, mumbles to self                      00   08   07   65  -01  -26
43.   Misidentifies persons or things                      07   00   04   65   02  -11
89.   Accuses others of wanting to hurt him               -08   21   20   62  -22  -16
10.   Listens to and follows directions                    30   15   20  -58  -07  -09

1.    Difficulty falling asleep                           [?]  [?]  [?]  [?]   57  -20
103.  Complains of aches and pains (hypochondriac)         14   22   10   22   54   02
68.   Expresses concern with bodily health                 13   16   12   24   61   17
55.   Awakens early, difficulty getting back to sleep     -16   14   08  -06   59  -16

133.  Displays an expressionless face                     [?]   01  -07  -05   02  -79
112.  Speaks in flat, monotonous manner                   [?]  [?]  [?]   18   04  -76
116.  Speaks in slow, drawn out manner                    -27  [?]  [?]   18  -02  -69
138.  Speaks in sad voice                                 -34  -14   11   02   11  -69

      Percentage of variance                             12.0  5.8  6.2  5.9  4.3  5.9

Note. As all factor loadings are less than one, decimal points have been omitted.
Entries shown as [?] are illegible in the source.


Anxiety-Tension

Factors reflecting anxiety and tension
emerged quite clearly in three of the four
rating instruments. These were Factor 4,
Anxiety-Phobic Reactions on the psychiatrist
IPSC; Factor 4, Concomitants of Anxiety on
the patient IPSC; and Factor 4, Jittery-Nervous
on the mood scales. Factor 1, Anxious
Depression of the patient IPSC, also
included a number of anxiety-related items
such as “feeling tense or keyed up,” “trouble
concentrating,” and “feeling fearful.” Taken
as a group, the key items in these factors
describe (a) the mood component of anxiety,
for example, feeling tense, restless, nervous,
and jittery; (b) the physiological and behavioral
accompaniments of a heightened anxiety
state, for example, heart pounding or racing,
faintness or dizziness, having to check and
doublecheck things done, trouble concentrating;
and (c) the fearful, phobic features
often present in extremely anxious individuals,
for example, suddenly scared for no
apparent reason, having to avoid certain
things, places, or events because they are
frightening, and feeling fearful. A separate
anxiety factor did not emerge from ratings
on the WBRS. In a prior study, Raskin and
Clyde (1963) identified an anxiety factor
based on WBRS ratings of acute schizophrenics.
The four items comprising this
factor, “is tense and anxious,” “appears helpless
or perplexed,” “is fidgety and nervous,”
and “is hesitant in making up his mind,”
tended to have moderate loadings, in the .30’s
and .40’s, on two or more of the six WBRS
factors from the present study. Three of these
four items had moderate loadings on WBRS
Factor 5, Sleep Disturbance, Somatic Preoccupation.


Sleep Disturbance 


Items describing various forms of sleep 
disturbance emerged as separate factors on 
the psychiatrist IPSC (Factor 5) and patient 
IPSC (Factor 5). The three items common 
to both factors were, “difficulty falling 
asleep,” “sleep that is restless and disturbed,” 
and “awakening in the early hours and then 
having difficulty falling back asleep.” Sleep- 
disturbance items were also contained in 
Factor 5 of the WBRS—Sleep Disturbance, 
Somatic Preoccupation. The presence of sepa- 
rate sleep-disturbance factors which included 
items characterizing both early and late in- 
somnia runs counter to expectations based on 
the neurotic-endogenous distinction in depression.
In the literature, early insomnia, “difficulty
in falling asleep,” is associated with
neurotic depression, whereas late insomnia,
“awakening in the early morning and then
having difficulty falling asleep,” is associated
with endogenous depression (Kiloh & Garside,
1963). Early insomnia and late insomnia
had correlations of .40 (p < .01) and .59
(p < .01), respectively, on the patient and
psychiatrist IPSC. These two items were also
highly correlated on the WBRS (r = .67;
p < .01), where ratings were based on the
nurse’s observations of the patient’s behavior
rather than on patient recall. Consequently,
many depressed patients apparently
experience a generalized sleep-disturbance 
phenomenon which includes both initial and 
late insomnia. This view is supported by 
Costello and Selby (1965), who also used 
both nurses’ observations and patient reports 
and failed to find differences in the sleep pat- 
terns of neurotic and endogenous depressions. 


Somatic Complaints 


Factors including somatic complaints and 
preoccupations emerged on the patient IPSC 
and WBRS. On the patient IPSC, somatic
complaints had significant loadings on two
factors: Factor 6, Hysteria, and Factor 7,
Gastrointestinal and Muscular Complaints. 
Factor 6, Hysteria, included the hypochon- 
driacal complaints of emotionally labile indi- 
viduals, for example, “sensation of choking 
and suffocating” and “pains in the heart and 
chest.” Additional items on this factor were 
“cries easily” and “trembling.” By contrast, 
Factor 7 included “pains in the stomach,” 
“backaches or muscular aches,” and “nausea 
or upset stomach,” symptoms usually associ- 
ated with psychosomatic rather than hypo- 
chondriacal complaints. WBRS Factor 5, 
Sleep Disturbance and Somatic Preoccupa- 
tion, included three sleep-disturbance items as 
well as three items which are most character- 
istic of hypochondriacal behavior, that is, 
“expresses concern with bodily health,” “asks 
for medicine other than that prescribed,” and 
“complains of aches and pains.” 


Retardation in Speech and Behavior 


Grinker et al. (1961) made a distinction 
between deep depression which included emotional
withdrawal with motor retardation and
milder forms of depression in which apathy
and a loss of interest in on-going events 
predominated. Results of the present study 
lend some support to this distinction. Items 
related to motor retardation were included in 
only one instrument in the present study, 
the WBRS. A factor labeled Retardation in 
Speech and Behavior (Factor 6) emerged 
from the WBRS and included, among others, 
the following items: “displays an expressionless
face,” “speaks in a flat monotonous
manner,” “appears slow in movements,”
“speaks in a sad voice,” and “looks tired and
worn out.” It will be recalled that items
reflecting the patient’s interest and involvement
in hospital activities emerged as a
separate factor (Factor 1, Social Participation)
on the WBRS. However, there was a
moderate but significant correlation between 
Factors 1 and 6 (r = —.34; p < .01). 


Conceptual Disorganization 


Thinking disturbance, the hallmark of the 
schizophrenic, was not well represented in 
the present study, and items in this area were 
included primarily on the WBRS. Factor 4 
of the WBRS was labeled Conceptual Disorganization
and included orientation items
such as “knows who he is,” “knows where he
is,” and “misidentifies persons or things.”
Additional items in this factor are “talks,
mutters, mumbles to self,” “accuses others of
wanting to hurt him,” and “has to be
reminded what to do.” There was a total
of 11 items in this factor, and this was the
second largest factor in number of items and
percentage of total item variance accounted
for on the WBRS.


Depressive Mood 


Factors related specifically to the mood
component of depression appeared only on
the mood scales. Factor 5, Depressed, contained
13 significant items and accounted for
the largest percentage of item variance
(15.5%) in the mood scales. The four key
items were “lonely,” “downhearted,” “blue,”
and “unhappy.” Whereas the items in Factor 6,
Carefree, would appear to be polar
opposites of the items in Factor 5, they
form a separate factor. Key items in Factor 6,
Carefree, are “carefree,” “cheerful,”
“satisfied,” and “happy.” Apparently what
this means is that an individual may rate
himself as not feeling blue or unhappy, but
this does not necessarily mean he feels
carefree and happy.

Although the mood, or affective, component
of depression did not emerge as a separate
factor on the other rating instruments, it was 
represented to some extent on at least two 
of the factors previously described. Factor 1, 
Anxious Depression, on the patient IPSC con- 
tained the item “feeling blue.” Factor 3, 
Morbid Obsessions, of the psychiatrist IPSC 
contained the items “feeling blue” and 
“feeling hopeless.” 

In sum, study findings revealed nuances of 
behavior within a category which would have 
been less apparent had we sampled only one 
aspect of patient behavior, such as interview 
or ward behavior. A good illustration of this 
point may be found in the prior discussion 
of the hostility factors from the different 
rating instruments. 

The finding that factors representing the 
mood or affective component of depression 
emerged as clear factors on the mood scales 
but did not emerge as separate factors on 
the other rating instruments highlights an 
additional advantage to the use of different 
rating instruments and different sources of 
information about the patient. Although the 
reasons for this finding are fairly obvious, 
that is, items sampling depressive mood were 
best represented in the mood scales, and the 
patient was the best source of information for 
rating these items, the point is not trivial.
In the past, investigators have generally 
used a single rating instrument and source 
of information about the patient as a basis 
for identifying factors or dimensions of 
psychopathology. 

Finally, we are not implying that the 
factors that emerged from these analyses 
describe all facets of psychopathology in hos- 
pitalized depressed patients. The raw data 
for this study were objective rating scale 
items sampling mainly the primary and 
secondary symptoms of depression. It is not 
only conceivable, but likely, that new factors 
would have emerged if, for example, our raw
data had consisted of dynamic formulations,
projective test scores, or scores on psycho- 
motor or cognitive tests. Further research is 
obviously needed to clarify the relationships 
between various levels of psychopathological 
behavior.


ANXIETY-REDUCING EFFICACY OF DISTRACTION,
CATHARSIS, AND RATIONALIZATION IN TWO
PERSONALITY TYPES 1


GERALDINE K. PIORKOWSKI 2
University of Illinois 


Following film arousal of anxiety, 80 high school boys classified as either 
repressors or sensitizers were exposed to 1 of 3 treatments—distraction, ca- 
tharsis, or rationalization—or to a control condition. The research objectives 
were to determine: (a) whether the 3 treatments in question were differen- 
tially effective in reducing anxiety and (b) whether the efficacy of an im- 
posed treatment is partially a function of the S’s characteristic manner of 
reducing anxiety. On all of the anxiety measures with 1 exception, there were 
no significant differential treatment effects nor any significant Treatment ×
Characteristic Defense interaction. On the criterion measure of overall reaction
time to both neutral and film-related words, significant treatment effects
emerged (p < .05). Here, the 2 confrontation and human interaction groups (ra- 
tionalization and catharsis) manifested less generalized associative disturbance 
than the 2 avoidance groups (distraction and control). 


Psychotherapists of most theoretical orien- 
tations have long regarded catharsis (verbali- 
zation of feelings accompanied by emotional 
abreaction) as the only effective vehicle for 
anxiety reduction. While this traditional view 
is currently being challenged by the behavior 
therapists, notably Wolpe (1958), numerous 
studies have demonstrated the efficacy of 
catharsis as an anxiety-reducing technique. 

As early as 1943, Haggard reported that
“catharsis-information” was more effective
than experimental extinction or rest in reducing
autonomic disturbance induced by
electric shock. The results of Pomeroy
(1950), Wiener (1955), Cohen, Silverman, and
Burch (1956), and Levison, Zax, and Cowen
(1961) are in agreement that catharsis of some
form is more effective in reducing anxiety than
“doing nothing,” talking about unrelated material,
or repressive techniques. While the
widely held view that a cathartic technique
is far superior to an avoidance-repressive
approach has been upheld fairly consistently,
no attempt has been made to explore systematically
the effects of distraction, catharsis,
or any other confrontation technique as a
function of the S’s usual manner of handling
anxiety.

1 This article was adapted from a dissertation
submitted in partial fulfillment of the requirements
for the PhD degree at the University of Illinois.
The author is particularly indebted to Charles W.
Eriksen, who supervised this research from its inception
to its completion, and to the staff of the
Veterans Hospital, Danville, Illinois, who provided
the facilities and materials necessary for the completion
of this study. This research was also supported
in part by Small Grant MH-08262 from the
National Institute of Mental Health, United States
Public Health Service.

2 Now at Illinois State Psychiatric Institute and
Loyola University, Chicago.

In describing an S’s characteristic or usual 
manner of reducing anxiety, the concepts of 
repressor and sensitizer have been used re- 
peatedly in recent years. These two person- 
ality types have been related to differential 
recognition thresholds for emotionally toned 
versus neutral stimulus material (Lazarus, 
Eriksen, & Fonda, 1951), to differential re- 
call of successes versus failures (Eriksen, 
1952), to differential expression of sexuality 
and hostility on a sentence-completion test 
(Lazarus et al., 1951), and to differential 
defense mechanism preferences on the Blacky 
Defense Preference Inquiry (Nelson, 1955). 
Eriksen (1963), in his review of research 
relevant to these two types, supports the 
clinical notions that repressors, like hysterics, 
use avoidance mechanisms in coping with 
anxiety, while sensitizers, like psychasthenic 
neurotics, attempt to reduce anxiety by intel- 
lectually confronting and ruminating about 
the source of conflict. The sensitizer relies on 
the defense mechanisms of intellectualization
and rationalization, the repressor on denial
and repression. 

It would seem that an imposed distractive 
or avoidance technique would be more ef- 
ficacious in reducing anxiety for repressors, 
who are accustomed to avoidance maneuvers 
in coping with anxiety, than it would be for 
sensitizers. The sensitizers, on the other hand,
should find an imposed confrontation technique
more consistent with their own habitual
manipulations and hence more effective in
reducing anxiety than would the repressors. It was
hypothesized that the efficacy of an imposed 
anxiety-reducing technique is partially a func- 
tion of the S’s characteristic manner of 
reducing anxiety. 

In order to test this hypothesis, three 
anxiety-reducing techniques were chosen for 
investigation: the time-honored cathartic 
technique, with its emphasis on feelings; dis- 
traction, which closely resembles repression; 
and rationalization. The inclusion of two con- 
frontation techniques (catharsis and rationali- 
zation), one feeling-oriented and the other 
rational in emphasis, permitted a comparison 
of the anxiety-reducing efficacy of these two 
components. 

Other questions which this study sought to 
answer were: (a) are distraction, catharsis, 
and rationalization differentially effective in 
temporarily reducing anxiety; and (b) do
they differentially affect the amount of anxiety
experienced at a later time when cues
from the anxiety-arousing situation are pre- 
sented to rearouse anxiety? 


METHOD 
Subjects 


The Repression-Sensitization scale (R-S; Byrne, 
1961) was administered to all boys at the 11th and 
12th grade levels at a local high school. Boys who
scored in the upper quartile of the distribution of
scores were labeled “sensitizers” and those scoring
in the lowest quartile were labeled “repressors.” The
mean of the R-S scores for all boys was 68.66,
SD = 17.98. The scores of the sensitizers ranged from
81 to 119 with a mean of 92.76, SD = 8.68, while
the repressors’ scores ranged from 30 to 54 with a
mean of 46.78, SD = 6.00. The 40 repressors and 40
sensitizers used in this study were volunteers from 
among the 176 boys designated as either repressors 
or sensitizers. The mean scores as well as SDs of 
these Ss very closely approximated those of the 
larger group from which they were drawn. 
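
The quartile-based labeling described above might be sketched as follows. This is only an illustration: the function name, the treatment of scores falling exactly on a cut point, and the "inclusive" quantile method are assumptions, since the text does not say how the quartile boundaries were computed. The example scores are hypothetical values chosen within the ranges reported above.

```python
from statistics import quantiles


def classify_repression_sensitization(scores):
    """Label each R-S score as "sensitizer", "repressor", or None."""
    # Quartile cut points; the computation method is an assumption.
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    labels = []
    for score in scores:
        if score >= q3:
            labels.append("sensitizer")   # upper quartile
        elif score <= q1:
            labels.append("repressor")    # lowest quartile
        else:
            labels.append(None)           # middle half, not studied
    return labels


# Hypothetical R-S scores for nine boys.
scores = [30, 46, 54, 60, 68, 75, 81, 93, 119]
labels = classify_repression_sensitization(scores)
```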


Measures 


Film. The anxiety-arousing stimulus was a 16-mm
color film with sound track (approximately 20
minutes in length), entitled “Death on the Highway.”3
It is a provocative commentary on the
dangers of careless driving and depicts a series of
mangled, charred bodies. The narrator of the film
adds to the startling quality of the film by his
magnification of the incidence of traffic accidents and
his detailed descriptions of the mangled victims.
This film has had prior usage and validation of a
sort as an anxiety-arousing stimulus in Alexander
and Husek’s study (1962).

Verbal anxiety measures. Alexander and Husek’s
anxiety differential (1962) was used to obtain a
verbal measure of situational anxiety. The anxiety
differential is a set of seven-point scales, similar
to Osgood’s semantic differential, which combines
concepts with seemingly unrelated scale dimensions
(e.g., Dreams: loose-tight; Germs: deep-shallow,
etc.).

From the Alexander and Husek scales 33 items 
were selected, among which were 29 scorable and 
4 buffer items. Two different orderings of these 33
items were randomly devised (Forms A and B). 
Each S completed one of the forms prior to the 
arousal of anxiety and the other following the 
anxiety-reducing manipulation. The ordering of the 
two forms was determined randomly for each S. 

The scores used in the analyses were total scores, 
that is, the sum of the individual scores (1-7) on 
each of the 29 scales, where a score of 7 represents 
the most anxious scale point. 
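
The total score described above amounts to a sum over the 29 scorable scales. A minimal sketch follows, assuming each response is already keyed so that 7 is the most anxious point; the buffer positions and the function name are hypothetical, since the article does not list which of the 33 items were buffers.

```python
def anxiety_total(responses, buffer_positions):
    """Sum the 1-7 ratings on the scorable items, skipping buffer items."""
    scorable = [r for i, r in enumerate(responses) if i not in buffer_positions]
    if len(scorable) != 29:
        raise ValueError("expected 29 scorable items")
    if any(not 1 <= r <= 7 for r in scorable):
        raise ValueError("each rating must lie between 1 and 7")
    return sum(scorable)


# Hypothetical positions of the 4 buffer items within a 33-item form.
BUFFERS = {0, 9, 18, 27}

lowest_possible = anxiety_total([1] * 33, BUFFERS)
highest_possible = anxiety_total([7] * 33, BUFFERS)
```

Under this scoring the possible totals run from 29 (every scale at 1) to 203 (every scale at 7).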

Skin-conductance measures. Prefilm, film, postfilm, 
and posttreatment conductance means in micro- 
mhos were computed for each S. The prefilm con- 
ductance mean was the average of three measures 
taken at the arbitrarily selected intervals of 1 minute 
25 seconds, 3 minutes 24 seconds, and 4 minutes 49 
seconds before the beginning of the film. For the
purpose of scoring responsivity to the film itself,
the film was divided into nine events, each con- 
taining a traffic accident victim in varying degrees 
of mutilation. The average of each S’s responses to 
Events 4, 5, and 6 were taken as representative of 
film arousal. The postfilm conductance mean was the 
average of three measures taken at the arbitrary 
intervals of 26 seconds, 1 minute 42 seconds, and 
3 minutes following the end of the film, and the 
posttreatment mean the average of three measures
taken at the intervals of 9 seconds, 1 minute, and
1 minute 50 seconds following the anxiety-reducing
manipulation.
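
Because the polygraph recorded skin resistance while the analyses used conductance in micromhos, a conversion step is implied. The sketch below assumes each resistance reading (in ohms) was converted to conductance before averaging; that ordering, the function names, and the sample readings are assumptions, not details given in the article.

```python
def micromhos(resistance_ohms):
    """Convert a resistance in ohms to conductance in micromhos."""
    return 1_000_000.0 / resistance_ohms


def conductance_mean(resistances_ohms):
    """Average conductance over a set of resistance readings."""
    values = [micromhos(r) for r in resistances_ohms]
    return sum(values) / len(values)


def film_arousal(event_conductances):
    """Mean response to Events 4, 5, and 6 of the nine film events."""
    if len(event_conductances) != 9:
        raise ValueError("expected nine film events")
    middle = event_conductances[3:6]  # zero-based slice for Events 4-6
    return sum(middle) / len(middle)


# Invented readings: three prefilm resistances of 100,000 ohms each.
prefilm_mean = conductance_mean([100_000, 100_000, 100_000])
# Invented conductances for the nine film events.
arousal = film_arousal([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
```

A resistance of 100,000 ohms corresponds to 10 micromhos, which is close to the prefilm mean the article later reports.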

Word-association measures. The word-association
test was composed of 69 words, 14 of which were 
designated “film-related.” These film-related (F-R) 
words were, in general, related to the main theme 
of the movie, for example, accident, careless, high- 
way, speed or to specific details focused on in the 
film, such as fire, leg, hand, windshield, tire. The
F-R words were assumed to evoke associations to
the scenes in “Death on the Highway” and, hence,
would reflect the degree of anxiety associated with
the film at the time of the word-association test.

3 This film is distributed by the Suicide Club of
Berkley, Michigan.

For each of the 14 F-R words, a neutral word 
was selected which appeared at about the same 
position in the series. In most cases, the matched 
neutral word preceded the F-R word. The scores
for each S, which were used in the analyses, were 
a reciprocal reaction time (RT) mean for the 14 
neutral words and a reciprocal RT mean for the 
14 F-R words.
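
One reading of the "reciprocal RT mean" is the mean of the reciprocals of the individual latencies, so that larger values indicate faster responding. The sketch below uses that reading; it is an assumption (the article does not say whether reciprocals were taken before or after averaging), and the latencies are invented.

```python
def reciprocal_rt_mean(latencies_sec):
    """Mean of 1/latency over a set of words; larger means faster."""
    if any(t <= 0 for t in latencies_sec):
        raise ValueError("latencies must be positive")
    return sum(1.0 / t for t in latencies_sec) / len(latencies_sec)


# Invented latencies in whole seconds for illustration only.
neutral_score = reciprocal_rt_mean([2, 2, 2, 2])
film_related_score = reciprocal_rt_mean([2, 4, 4, 4])
```

Here the slower film-related latencies give the smaller score, which is the direction of the neutral versus F-R difference the article reports.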

For the purpose of scoring GSR responsivity to 
the words, five pairs of words (cow, accident; 
carpet, highway; farm, ambulance; butter, careless; 
house, car) were selected as representative of the 14 
pairs used in the RT analysis. The difference be- 
tween the lowest skin conductance at the point of 
word presentation and the highest skin conductance 
during the 10-second interval following the presenta- 
tion of a word was regarded as the GSR. Two 
scores were obtained for each S: a mean GSR in 
micromhos for the five neutral words and a mean 
GSR for the five F-R words. 
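
The GSR definition above can be sketched as a peak-minus-baseline computation. The sampling format, the tolerance used to decide which readings count as "at the point of word presentation," and the sample values are all assumptions made for illustration.

```python
def gsr(samples, onset, window=10.0, onset_tolerance=0.5):
    """GSR for one word from (time_sec, conductance_micromhos) samples.

    Baseline: lowest conductance at the point of word presentation.
    Peak: highest conductance in the `window` seconds after presentation.
    """
    at_onset = [c for t, c in samples if abs(t - onset) <= onset_tolerance]
    after = [c for t, c in samples if onset < t <= onset + window]
    if not at_onset or not after:
        raise ValueError("need readings at onset and within the window")
    return max(after) - min(at_onset)


def mean_gsr(values):
    """Average GSR over a set of words (e.g., the five neutral words)."""
    return sum(values) / len(values)


# Invented trace: word presented at t = 0; the reading at t = 12 falls
# outside the 10-second window and is ignored.
trace = [(0.0, 10.0), (2.0, 10.4), (4.0, 11.1), (8.0, 10.7), (12.0, 12.0)]
demo_gsr = gsr(trace, onset=0.0)
```

For this invented trace the GSR is about 1.1 micromhos (11.1 minus 10.0).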


Procedure 


Prior to his arrival, each S was assigned arbitrarily 
to one of four groups: distraction, catharsis, 
rationalization, or control. An attempt was made to 
keep the Ns in each of the eight cells of the experi- 
mental design (Treatments X Repressors-Sensitizers) 
roughly equivalent at any given time. This constituted
the only restriction on random selection of Ss.
Each of the four groups was finally composed of 
10 repressors and 10 sensitizers. The E throughout
the experiment was the author. 

Each S was scheduled individually for a 2-hour
session. The experiment was described simply as an
attempt to measure body reactions to a traffic safety
film. A Grass six-channel polygraph was used to
record skin resistance. The electrode paste used with
the zinc electrodes was prepared according to the
instructions set forth in an article by Lykken
(1959). One of the zinc electrodes was placed near 
the arch on the plantar surface of the left foot and 
the other vertically opposite it on the upper surface 
of the foot. During the 15 minutes of initial record- 
ing, each S filled out one form of the anxiety 
differential, and was then told to look at magazines 
if he chose, while the instrument was being cali- 
brated. The purpose of this initial 15 minutes of 
recording was to obtain a stable prestress measure 
of skin resistance. 

The film instructions were brief. Each S was told 
to pay close attention to the film and to sit quietly 
for a few minutes after the film was over so accurate
readings could be taken. Each S then watched
the film “Death on the Highway” while continuous
skin resistance measures were taken. After the film
presentation, each S sat quietly for 3 minutes, and
then appropriate treatment manipulations were
begun.


Distraction condition. Each S was told that in 
order to compare body reactions to the traffic film 
with those to films of a lighter nature, he would 
see two Abbott and Costello films. The two films, 
“Pinch Me Please” and “Riot on Ice,” totaling ap- 
proximately 20 minutes, were then shown to each 
S in this condition. 

Catharsis condition. While there was some stand- 
ardization in terms of areas to be covered, the 
primary emphasis was on encouraging the Ss to talk 
freely about their reactions to the film. The E’s
responses were limited essentially to those which 
provided reassurance that their expressed fears and 
concerns were understandable and to interpretations 
and reflections of feelings. Each S was asked the 
following questions, primarily as a vehicle for 
catharsis: (a) How do you feel? (b) What parts 
of the film bothered you most? (c) Do you have 
any idea why? (d) What were some of the things 
you were thinking as you watched the film? (e) Do 
you feel worried about something like that happen- 
ing to you? (f) Why do you think most people 
get pretty upset in watching a film of this kind? 
Among the more standard interpretations given to 
the Ss were the following: 


You know it’s funny but most of us put un- 
pleasant things out of our minds most of the time, 
unpleasant things like death and mangled bodies. 
Then when out of the blue, we’re forced to face 
these unpleasant facts, we’re all the more upset 
because we’re not prepared for them... . You 
know deep down we’re all afraid of being physi- 
cally hurt, losing an arm or leg, afraid of facing 
the unknown in death. I think this kind of film 
sort of arouses these fears we all carry within us. 


The cathartic sessions ranged anywhere from 10 
to 20 minutes, depending on the ease with which 
the S could verbalize his reactions. It might be 
noted at this point that for the overwhelming major- 
ity of Ss, this condition appeared to be clearly 
cathartic. In response to the question “How do you 
feel?” most Ss, with little additional encouragement, 
ventilated, abreacted, and elaborated on their reactions
to the film. When the cathartic session was over
before the prescribed 20 minutes, the S was told 
he could look at magazines until the physiological 
recordings were “checked again in [X] minutes for 
accuracy.” 

Rationalization condition. Each S was told, “Be- 
fore you spend a great deal of time worrying, let 
me read you a letter written by one of the editors 
of a big Chicago newspaper to the Suicide Club 
which puts out the film you’ve just seen. After I’m 
done with the letter, I’d like your reactions both 
to it and the film.” A few excerpts from this 
“manufactured” letter follow: 


I have just seen your movie “Death on the 
Highway” and while I agree people need to know 
the tragedies that can happen on the highway, 
let’s be honest. The chances of something as tragic 
happening to me or to any other single person
as was depicted on your film are very, very small.
I sometimes think it’s because accidents of the 
kind you showed are so rare that you make so 
much out of them. ...I want you to understand 
that I’m not saying “Drive any old way and you 
won’t die.” One has to observe safety regulations, 
speed limits, and so forth or else one is apt to 
wind up in your movie. But I am saying that 
people make a great deal out of the unusual and 
forget that 994% of all drivers are never involved 
in a serious accident. I’m sure it took you at least 
3-4 years to come up with the number of bloody 
accidents you showed on the film. 


After the letter reading, the S’s rationalizing 
comments (e.g., “It’s not likely to happen to me.”)
were strongly reinforced. Negative criticisms of the 
letter were countered with comments on the part 
of the E such as, “Yes, that’s true, but the fact
remains that there are a great many drivers who 
are never involved in a serious accident.” The at- 
tempt here was to minimize the impact of the film 
with “probability-like” statements. As with the 
cathartic sessions, these ranged anywhere from 10 
to 20 minutes. The same procedure used in the 
catharsis condition was followed here when the 
session was over before the prescribed 20 minutes.

Control condition. Each S was told that in order 
to check on the calibration of the machine and get 
some idea of how his body reactions change with 
time, additional recordings would be taken in 20 
minutes. To avoid being unhooked and hooked 
up again, he was asked to relax and look at 
magazines if he chose. The E left the room for the 
entire period of time to avoid being engaged in 
conversation about the film.

Following the 20 minutes allotted for anxiety- 
reducing manipulations, skin resistance was recorded 
for 2 minutes. The electrodes were removed, and 
each S filled out an alternate form of the anxiety 
differential. A 15-minute “break,” away from the 
experimental room, then ensued. 

When each S returned to the experimental room, 
he was again hooked up to the polygraph and the 
word-association test was introduced as an attempt 
to measure “how rapidly you can think.” Each S 
was told that the particular word he gave did not 
matter, as long as he responded rapidly. The 69-item 
word-association list was presented after the skin- 
resistance baseline had stabilized. Latency of response 
was recorded to the nearest second. 

The interval of time between words varied, de- 
pending on the length of time the GSR subsided 
or returned to the baseline. However, because the 
baseline kept shifting throughout the word-associa- 
tion procedure, a 20-second maximum between words 
was established.


RESULTS


Skin Conductance


The film was successful in arousing anxiety, 
as measured by increased skin conductance.
The differences between the prefilm, film, and
postfilm conductance means for all Ss were 
significant (p < .001) in the direction which 
is indicative of increased arousal as a func- 
tion of the film experience. The three means 
were 10.2, 11.2, and 12.2, respectively. There 
were no significant repressor-sensitizer effects 
in the 3 X 2 analysis of variance (Type I 
design, Lindquist, 1953) of the prefilm, film, 
and postfilm conductance means. 

While the repressor-sensitizer differences in 
physiological arousal were not significant, the 
repressors had higher skin conductance means 
than the sensitizers at each of the measured 
intervals: prefilm, film, postfilm, and post- 
treatment. To further evaluate the physio- 
logical responses of these two groups, another 
measure of arousal, variability, was used in
comparing the two groups. The variance of 
each S’s nine film measures of skin conduct- 
ance was computed and treated as a variabil- 
ity score for each S. The variance of these 
scores for the repressors was significantly 
greater than the variance for the sensitizers 
(F = 3.98, p < .01). In addition, the variability
score mean for the repressors was
higher than the mean for the sensitizers, 
though not significantly (p < .12). These re- 
sults tend to suggest that repressors demon- 
strate more variability or lability in physio- 
logical responsivity than sensitizers. 
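
The variability analysis above has two steps: a per-subject variance over the nine film measures, then a comparison of how spread out those variability scores are in each group. The sketch below assumes the sample (n - 1) variance and a larger-over-smaller variance ratio; neither choice is stated in the article, and the numbers are invented.

```python
from statistics import variance


def variability_score(film_measures):
    """Variance of one S's nine film conductance measures."""
    if len(film_measures) != 9:
        raise ValueError("expected nine film measures")
    return variance(film_measures)  # sample variance (assumed)


def variance_ratio(scores_a, scores_b):
    """Ratio of the larger group variance to the smaller one."""
    var_a, var_b = variance(scores_a), variance(scores_b)
    return max(var_a, var_b) / min(var_a, var_b)


# Invented variability scores for two small groups.
group_a = [1, 2, 3, 4, 5]   # variance 2.5
group_b = [2, 3, 3, 3, 4]   # variance 0.5
demo_ratio = variance_ratio(group_a, group_b)
flat_subject = variability_score([10] * 9)
```

The resulting ratio would then be referred to an F table with the appropriate degrees of freedom; that step is omitted here.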

The results of a 2 X 2 X 4 analysis of vari- 
ance (Type III design, Lindquist, 1953) on 
the postfilm and posttreatment conductance 
means failed to yield any differential effects 
for the four treatments or for the repressor- 
sensitizer variable. Only the difference be- 
tween the postfilm and posttreatment means 
for all Ss was significant (p < .01), indi- 
cating that the Ss were less anxious following 
the treatments than they were immediately 
after the film. The postfilm mean was 12.2 
and the posttreatment mean was 11.5.


Anxiety Differential


On the anxiety differential, the Ss mani- 
fested significantly more anxiety on the 
posttest than they had initially (F = 7.98, 
p < .01). In addition, the repressors overall
(pre- and posttest scores) manifested
less anxiety than the sensitizers (F = 13.46,
p < .001) and manifested less of an increase
in anxiety on the posttest than the sensitizers
(F = 4.62, p < .05). There were no significant
treatment effects in the 2 X 2 X 4 analysis
of variance of the pre- and posttest scores
of the anxiety differential. 


Word-Association Test 


In the 2 X 2 X 4 analysis of variance of 
the neutral reciprocal RT means and the F-R 
reciprocal RT means, the difference between 
the F-R mean and the neutral mean was 
significant (F = 143.49, p < .001). The Ss
were significantly slower in responding to 
F-R words than to neutral words (reciprocal 
RT mean of .37 versus .44). 

On this same analysis, there was also a sig- 
nificant treatment effect (F = 3.00, p < .05). 
This significant treatment effect, however, was 
not on the expected differential RT to neutral 
versus F-R words criterion, but on total RT 
scores, which additively combined the F-R 
mean with the neutral mean. On this criterion 
of overall associative responsivity, the ration- 
alization group responded most rapidly, fol- 
lowed by the catharsis, distraction, and con- 
trol groups, in that order. The criterion which 
was expected to reflect the differential efficacy 
of the treatments, namely the FR-Neutral X
Treatments interaction, did not approach 
significance. 

Table 1 presents the neutral and F-R 
means for each of the four treatment groups.
An examination of these means reveals that 
the significant treatment effect reflects differ- 
ential latencies for the four groups over both 
neutral and F-R words. The rationalization 
group responded most rapidly not only to the 
F-R words but to the neutral words as well. 
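
The pattern described here can be checked with simple arithmetic on the Table 1 means. In the sketch below the values are taken from Table 1 with decimal points restored where the scan dropped them, which is an assumption; the variable names are illustrative.

```python
# Reciprocal RT means per treatment as (neutral, film-related).
TABLE_1 = {
    "distraction": (0.4228, 0.3564),
    "catharsis": (0.4442, 0.3750),
    "rationalization": (0.4913, 0.4071),
    "control": (0.4104, 0.3438),
}

# Overall responsivity: neutral mean plus F-R mean for each group.
totals = {group: n + fr for group, (n, fr) in TABLE_1.items()}

# Differential criterion: neutral mean minus F-R mean for each group.
gaps = {group: n - fr for group, (n, fr) in TABLE_1.items()}

# Groups ordered from fastest (largest reciprocal RT) to slowest.
fastest_first = sorted(totals, key=totals.get, reverse=True)

# Column means, for comparison with the overall means in the text.
neutral_mean = sum(n for n, _ in TABLE_1.values()) / len(TABLE_1)
film_related_mean = sum(fr for _, fr in TABLE_1.values()) / len(TABLE_1)
```

On these values the ordering is rationalization, catharsis, distraction, control, and the column means round to .44 and .37, matching the text. The neutral minus F-R gaps differ by less than .02 across groups, in line with the nonsignificant interaction.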

While the Treatment X Personality interaction
in the analysis of variance of the RT
scores approached significance (F = 2.47, p
< .10) and conformed to the expectations of
this research, the criterion variable here again
was overall responsivity to both F-R and
neutral words rather than differential RT. On
this criterion, the repressors manifested less
associative disturbance than the sensitizers
within the distraction condition, while the
sensitizers showed less disturbance than the
repressors within both the cathartic and rationalization
treatments. In the control condition,
which was essentially an avoidance condition
like distraction, the repressors manifested
less associative disturbance than the
sensitizers.


TABLE 1

RECIPROCAL REACTION TIME MEANS TO NEUTRAL
AND FILM-RELATED WORDS

                         M
Treatment            N        F-R
Distraction        .4228     .3564
Catharsis          .4442     .3750
Rationalization    .4913     .4071
Control            .4104     .3438

Note. Abbreviated: N = neutral, F-R = film-related.

The GSR means to the neutral and F-R 
words tended to support, though not signifi- 
cantly, the treatment effects suggested by the 
RT scores. The differences between the neu- 
tral and F-R GSR means were smaller in the 
two confrontation conditions (catharsis .16; 
rationalization .19) than in the two avoidance 
groups (distraction .26; control .24). In ad- 
dition, the total GSR responsivity (neutral 
mean + F-R mean) was greater in the two 
avoidance groups (distraction 1.94; control 
2.00) than in the two confrontation groups 
(catharsis 1.98; rationalization 1.71). The
only significant result on the 2 X 2 X 4 analy- 
sis of variance of the GSR neutral and F-R 
means was the neutral versus F-R difference. 
The mean GSR to the neutral words was significantly
lower (p < .001) than the F-R
mean (.80 versus 1.10). 


DISCUSSION


That the film “Death on the Highway” was 
successful as an anxiety-arousing stimulus can 
be argued from many different lines of evi- 
dence. Not only did the Ss show significant 
increases in skin conductance during and im- 
mediately following the film when compared 
to their prefilm level, but they were signifi- 
cantly slower in responding to and showed 
significantly greater GSRs to F-R words than 
to neutral words. In addition, the Ss appeared 
significantly more anxious on the anxiety 
differential following the film than they had 
initially. While each of these indexes can be 
questioned individually as valid measures of
anxiety, the total weight of evidence lends
fairly strong support to the contention that 
the film was a successful anxiety-arousing 
stimulus. Most Ss found the film “gruesome,” 
“difficult to watch,” and, at least, a temporary 
inducement “to drive more carefully.” 

While the results of the physiological data 
tended to suggest that the repressors were 
more aroused and physiologically labile during 
the film than the sensitizers, the repressors 
manifested less anxiety (p < .001) on the
anxiety differential than the sensitizers. This 
kind of discrepancy between physiological 
data and verbal measures of anxiety was also 
observed by Lazarus and Alfert (1964). They 
found that high deniers admitted less evidence 
of disturbed affect while showing greater evi- 
dence of stress reaction on autonomic mea- 
sures than low deniers. Such a discrepancy 
between verbal and physiological measures of 
anxiety has important theoretical implications. 
It implies, for example, that a distinction be- 
tween the S’s perception of anxiety in him- 
self and physiological arousal should be made 
in measuring anxiety. 

With respect to the major considerations of 
this study, it seems clear that the treatments 
were not differentially effective in temporarily 
reducing anxiety. All four groups showed a 
significant and, yet, comparable decrease in 
skin conductance following the treatments, 
when compared to their postfilm level. There 
were also no significant differences among the 
four groups on the anxiety differential mea- 
sure of anxiety. Thus, it appears that in an 
anxiety-arousing situation which is focal and 
primarily external, the dissipation of anxiety 
is primarily a function of time following the 
removal of the anxiety-arousing stimulus. The 
catharsis, rationalization, and distraction con- 
ditions had no greater impact on this gradual 
reduction of autonomic arousal than the con- 
trol condition, which was essentially a tran- 
quil avoidance condition, devoid of human 
interaction. Whether this particular finding is 
limited to the specifics of this study, that is, 
to the particular anxiety-arousing stimulus 
used, to the specific kinds of treatments em- 
ployed, and to the Ss in question can only be 
answered by further experimentation. 

With regard to the question of whether the 
treatments were differentially effective in extinguishing
anxiety, that is, in affecting the
degree of anxiety experienced when the anxi- 
ety-arousing stimuli are again encountered, 
the experimental results were equivocal. There 
were no significant treatment effects on the 
word-association criterion of differential RT to 
neutral versus F-R words. Instead, the treat- 
ment groups differed significantly in terms of 
RT to all words with the two confrontation 
groups (rationalization and catharsis) re- 
sponding more rapidly than the two avoid- 
ance groups (distraction and control). 

The question arises—does this significant 
effect reflect the differential efficacy of the 
treatments, or does it reflect initial group 
differences in associative flexibility and/or 
other related intellectual or personality varia- 
bles? Unfortunately, there is no clear-cut 
method of separating these two possible sets 
of influencing variables. The GSR results to 
the neutral and F-R words tend to suggest 
that the two confrontation groups were less 
anxious than the two avoidance groups. How- 
ever, the GSR differences were not significant, 
thereby weakening the support they provide 
for the differential treatment effect interpre- 
tation of the significant RT results. The sig- 
nificant RT results can only be regarded as 
suggestive, that is, as a source of hypotheses 
for future studies. 

Ordinarily, an adequate control for initial 
differences among Ss in word-association skills 
is provided by a measure of RT to neutral 
words. However, the significant treatment 
group differences in RT to all words in this 
study raises the possibility that RT to neu- 
tral words was also affected by anxiety, and 
that the RT differences among the groups 
were a function of differing degrees of gen- 
eralized associative disturbance. In order to 
take this possibility into consideration, a 
more appropriate control for initial differ- 
ences in word-association skills would have 
been an RT measure obtained under neutral 
conditions, that is, prior to the arousal of 
anxiety. 

The question of whether the efficacy of an 
imposed treatment is partially a function of 
the S’s characteristic manner of reducing anxi- 
ety also cannot be answered unequivocally. 
While the expectations of this research as to 
interaction between treatments and defensive
predispositions tended to be supported (p <
.10), the criterion variable here also was RT 
to all words. On this variable, the repressors 
showed less associative disturbance than the 
sensitizers in the two avoidance conditions, 
while the sensitizers manifested less associa- 
tive disturbance than the repressors in the 
two confrontation conditions. Lazarus and 
Alfert (1964), and Speisman, Lazarus, Mord- 
koff, and Davison (1964) have found that the 
effectiveness of defensive communications ac- 
companying an anxiety-arousing film is par- 
tially a function of their compatibility with 
the defensive predispositions of the Ss. The 
directionality of the results in this study is at 
least consistent with the findings of other 
related studies on this issue. 

Whether the results of this study pertain- 
ing to the central considerations and expecta- 
tions can be generalized beyond the specifics 
of this study cannot be judged at this time. 
Among the questions which need to be an- 
swered by future studies are the following: 
Are repressors and sensitizers at the high 
school level comparable to these groups at 
the college level? Did the fact that the E was
a female interacting with adolescent males 
have any effect on the anxiety and/or treat- 
ment results? Was the apparent inferiority of 
the distraction and control conditions a func- 
tion of their avoidance nature or a function 
of their being conditions devoid of human 
interaction? 

In the midst of many unanswered questions 
it seems clear that the experimental design, in 
general, provides a meaningful and practical 
paradigm for the investigation of anxiety and
the efficacy of various anxiety-reducing techniques.
The film “Death on the Highway”
proved to be a valid means of arousing anxiety
which affected such diverse measures as
skin conductance, the anxiety differential,
GSR, and RT to film-related words.


EFFECTIVENESS OF PARENTS OF HEAD START CHILDREN
AS ADMINISTRATORS OF PSYCHOLOGICAL TESTS 1


MELVIN E. ALLERHAND

The purpose of this study was to assess the effectiveness of parents of Head
Start children as administrators of psychological tests. 7 parents independently 
tested a group of children who were also evaluated by psychology graduate 
students. The individually administered tests involved were the Caldwell Pre- 
School Inventory and the Peabody Picture Vocabulary Test. The correlation 
between the 2 groups of testers was .88 on the Pre-School Inventory and .64 
on the Peabody Picture Vocabulary Test. On the strength of these correlations 
and other methods of examining the data, it is suggested that less sophisticated 
individuals with high motivation may be adequate in performing certain 
professional tasks, such as the administration and scoring of individual tests. 


In very recent years, much consideration 
has been given to the area of training non- 
professionals in tasks currently viewed as be- 
ing in the professional range (Hallowitz & 
Riessman, 1965; Pearl & Riessman, 1965; 
Rioch, Elkes, & Flint, 1965). The end prod- 
ucts of such studies and demonstration proj- 
ects suggest a range of clearly defined, spe- 
cific tasks that, with appropriate training and 
supervision, may very effectively be carried 
out by individuals indigenous to the situation. 

A recent paper by Blum (1965) points to 
the importance of defining service jobs rele- 
vant to the services required, rather than pri- 
marily considering the professional training— 
in this case, of the social worker. He further 
indicates that to accomplish a certain task, it 
may be necessary to have a variety of people 
with a variety of expertise and special emo- 
tional qualifications. Schwartz (1962) has im- 
plemented these ideas in a project with pro- 
fessionals and nonprofessionals within a pub- 
lic assistance agency. Such thoughts and 
efforts come out of the clear recognition that 
there are considerable manpower shortages 
and that there may be particular tasks that 
can be handled more effectively by individuals 
either closer to the actual locus of applica- 
tion of the service or better equipped to carry 
out specific aspects of a larger function. This 
position regarding the poor person has been
forcefully expressed by Pearl and Riessman
(1965).

1 This work was supported by the Office for
Economic Opportunity under Contract OEO-512.
Evelyn Century, research associate, has aided the
author in the analysis of the data and the preparation
of this report.

There is another argument presented by 
proponents of training poor people to carry 
out services within their own community. 
This is an effort to upgrade the level of func- 
tioning and permit the development of lead- 
ership within the group of people who are 
seeking to gain an economically more desir- 
able position in society. This orientation has 
been central in such programs as Head Start 
—the inclusion of poor people on various 
boards for planning an assault on poverty, 
etc. 

Although we have indications that non- 
professional people can be trained for certain 
professional tasks (Hallowitz & Riessman,
1965), there is still a general feeling of doubt, 
and, understandably, within professional 
ranks there are rather specific questions as to 
the performance of nonprofessionals in par- 
ticular professional tasks. It is the intent of 
this investigator to: (a) evaluate the poten- 
tialities of previously untrained people for 
filling service roles within their own primary 
groups, (b) locate and define the areas of 
service for which nonprofessional people can 
be trained, (c) investigate whether it is feasi- 
ble to establish a training and supervisory 
program which will effectively produce such 
service workers, and (d) measure the impact
of such trained people on their families
and communities. 

If it is determined that people with limited
formal education could be used in handling
certain services which previously had
been performed only by trained professionals, 
a dual benefit could be realized—a benefit to 
those being serviced and to those doing the 
servicing. Underlying this approach is the 
belief that individuals who are part of a group 
may have greater effectiveness in performing 
services within that group. This study is one 
effort in that direction. 

The purpose of this study was to assess the 
effectiveness of parents of Head Start (HS) 
children as administrators of psychological 
tests. Particular consideration was given to 
such factors as identification with the poor 
community and similarity of ethnic and skin- 
color characteristics which might contribute 
to differential test success of the HS children 
in the sample. Katz (1964) referred to the 
impact of such variables as the kind of tester, 
circumstances of the testing situation, etc., on 
the level of functioning, particularly of the 
Negro child. He concluded that Negro chil- 
dren are more vulnerable to stress in a pre- 
dominantly white situation and thus are likely 
to have lowered achievement. Thus, we won- 
dered whether the poor child in an urban cen- 
ter such as Cleveland would respond differ- 
ently to a white sophisticated tester as con- 
trasted with a Negro unsophisticated tester. 

Further, we reflected on the question of 
sophistication as a variable; there are indica- 
tions that unsophisticated testers at times 
enable a subject to achieve a higher score be- 
cause of inadvertent clueing, and, on other oc- 
casions, that the novice tester may cause a 
lowered test score because of inexperience in 
manipulating the test items. 


METHOD 


The study is primarily concerned with comparing
a group of parents of Head Start children selected to
administer psychological tests with a group of experienced
graduate student testers.

Seven parents (four Negro and three white)? of 
Head Start children were trained in the administra- 
tion of the Pre-School Inventory Test? (PI) and 
ern 


?The size of the sample only permits for reason- 
able indications, especially in the comparison of 
white and Negro testers. 

8 This test was developed by Bettye M. Caldwell 
for the evaluation of the Head Start Program, sum- 
mer 1965. In the original form, the 161-item test 
Purportedly measured the academic achievement of 
young children as expected by nursery and kinder- 
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the Peabody Picture Vocabulary Test (PPVT). The 
parents were paid $2.00 per hour during their period 
of participation. A counter-balanced order of test ad- 
ministration was established. Thus, there was an 
equal number of children initially tested by the par- 
ents and graduate students. The child was tested on 
two different occasions with a maximum of 5 inter- 
vening days. Fifty-seven Negro Head Start children 
were evaluated during this study. The age range 
was 5 years, 2 months to 6 years, 3 months, with a 
mean age of 5 years, 4 months. 

The seven parent testers were female, ranged in 
formal education from the 9th to 12th grade, and 
were 28-39 years of age. They were selected ran- 
domly from a group of 30 volunteers, 

The three graduate students were female, white, 
had a minimum of 1 year of graduate study in psy- 
chology, including training in test administration, 
and had administered at least 100 PIs and 60 PPVTs. 


Training Procedure 


The parent testers experienced an intensive 3- 
session training in test administration. The first ses- 
sion involved a description of the pilot study and 
an acquaintance with the PI. After a discussion of 
the construction of the test, there was an examina- 
tion of the series of items included. Whenever there 
was some question about the phrasing of the item 
or the particular method of categorizing and scoring, 
this was discussed in some detail. The involvement 
of the parents became increasingly evident. It was
strongly recommended that the parents try out any
of the items on available children in the community.
It was further indicated that during these training 
sessions attempts would be made to point out the 
kinds of errors that unsophisticated testers may make 
in the application of test items. This seemed to set 
the stage for a good-natured and frank exchange on 
known errors which had considerable payoff in sub- 
sequent training sessions, as it became much more 
tolerable to hear criticism. The joys of success and 
accomplishment seemed to increase tolerance for 
such critical exchanges and resulted in an applica- 
tion of the suggestions in the testing approach. 

The second session, which was 3 days later, was 
primarily devoted to discussing the particular ques- 
tions that parents had regarding the PI and further 
centered on the experiences they had had in practice 
administrations. The questions they raised paral- 
leled those previously raised by sophisticated testers 
during an earlier training experience. For example, 
in Item 30 on the PI the child is asked how many 
broken arms he has, and in order to get some type 


garten teachers. Item analysis has suggested the fol-
lowing categories which the 1966 revision (85 items)
reflects: personal-social responsiveness, associative 
vocabulary, concept activation-numerical, and con- 
cept activation-sensory (Caldwell & Soule, 1966). 

4 All the children received both administrations of
the Pre-School Inventory; however, 13 children
were not given the PPVT by the graduate students
because of time pressures.

of a “none” or “not any” response, the parents sug-
gested some stimulus phrases for this question which
they had found effective in gaining the response. One 
of the suggestions was “How many tails do you 
have?” 

The remainder of the second training session in- 
volved role-playing some of the items on the PI and 
the introduction of the administration of the PPVT. 
There was a demonstration of the PPVT followed by 
each of the parent testers attempting to administer
portions of the test. A critical discussion of the 
testing approaches ensued. It was then recommended 
that the parents administer as many PIs and PPVTs 
to community children as they could prior to the 
final training session. 

During the final session, the majority of the time 
was used in discussing particular problems that the 
parents experienced in the administration of both 
tests. Some of the interfering characteristics were 
brought to the attention of the parent testers. For
example, one of the parents tended to use the testing 
situation as a teaching medium. She persisted in 
probing for the “correct” answer. The group became 
aware of the testing as an assessing technique. In the 
final role-play that concluded the third session, this 
particular parent showed a decided decrease in the
probing approach.


Testing Procedure 


The tests were actually administered in two Head 
Start child-development centers. A member of the 
research team acted as coordinator in making ar- 
rangements for rooms and the selection of the group 
of children to be tested. Except for these adminis-
trative matters, the parents carried out the entire
testing procedure, escorting the child from his class- 
room to the testing room and from the testing room 
back to his classroom. As indicated earlier, each
child was independently tested on different occasions
by both a parent tester and graduate student. 


Method of Analysis 


Similarity of results of parent and graduate stu- 
dent testers was compared by (a) deriving percent- 
ages of the total number of agreements on answers 
to items in the PI by the group of Head Start chil- 
dren, (b) correlating raw scores of the children ob- 
tained for the PI and the PPVT, and (c) obtaining 
significance of difference tests based on the group 
means for the PI and the PPVT. In addition, a sig- 
nificance test was calculated for the test series, com- 
paring the first with the second test administration 
to determine possible discrepancies relating to the 
test-retest process. A more detailed analysis of the 
categories within the PI was made comparing the 
content and concepts involved in the test. 
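
The first two comparisons described above can be illustrated with a short sketch. The Python below is not part of the original study: the item scorings and raw scores are invented, and the function names are hypothetical. It assumes each PI item is scored 1 (pass) or 0 (fail) and computes (a) the proportion of items scored identically by the two testers for one child and (b) the product-moment correlation between the raw scores obtained by the two testers.

```python
from math import sqrt


def percent_agreement(items_a, items_b):
    """Proportion of items that two testers scored identically for one child."""
    if len(items_a) != len(items_b) or not items_a:
        raise ValueError("item lists must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(items_a, items_b) if a == b)
    return matches / len(items_a)


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented item scorings for one child (1 = pass, 0 = fail).
parent_items = [1, 0, 1, 1, 0, 1, 1, 0]
student_items = [1, 0, 0, 1, 0, 1, 1, 1]
agreement = percent_agreement(parent_items, student_items)  # 6 of 8 items match

# Invented raw scores for five children, one list per tester group.
parent_totals = [40, 52, 47, 61, 55]
student_totals = [42, 50, 49, 63, 53]
r = pearson_r(parent_totals, student_totals)
```

Averaging `percent_agreement` over the children seen by one parent tester would give a per-tester figure of the kind reported in Table 1, under the assumptions stated here.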


RESULTS 


Table 1 shows an overall examination of 
the amount of agreement between the parent 
testers and the graduate student testers. The 
total average of agreement was 76% for the 
PI. It is also evident from the table that there 
was a high degree of general consistency 
among the parent testers, as reflected by the 
74-79% range. Further, there is no apparent 
difference between the results obtained by the 
Negro and white parent examiners. Thus, the 
remainder of the analysis treats the parents 
as a group. 

TABLE 1
PERCENTAGE OF AGREEMENT BETWEEN INDIVIDUAL PARENT AND GRADUATE STUDENT TESTERS
ON THE PI

Parent                                                              Average
tester   Agreement on individual tests*                       N     agreement
W1       .67 .92 .63 .76 .76 .80 .68 .78                      8     .76
W2       .76 .77 .86 .78 .77 .65 .81                          7     .77
N3       .82 .77 .85 [illegible] [illegible] .75 .72 .80      8     .77
N4       .82 .67 .78 .64 .85 .79 .75 .71 .76                  9     .75
W5       .87 .71 .82 .80 .76 .59 .71                          7     .75
N6       .73 .80 .76 .84 .77 .78 .84 .78                      8     .79
N7       .87 .67 .81 .76 .79 .74 .70 .69 .62 .72              10    .74
Total                                                         57    .76

Note.—Abbreviated: W = white; N = Negro.
* Columns 1-3—one student tester for all Ss; Columns 4-10—panel of student testers was used with these Ss.


Table 2 shows the degree of relationship
between the children’s test scores achieved by
the parent and graduate student testers. The
correlation for the PI was .88 and for the
PPVT was .64. Both of these relationships
are significant (p < .01).

Table 3 is an attempt to detect whether
the parents were systematically affecting per-
formances upwards or downwards. Testing the
differences after combining the scores (whether
the parent administered the test first or sec-
ond), we find no significant difference between
the means of parents and students for either
PI or PPVT. On the PI, significant differ-
ences were found in favor of the second test-
ing, whether the second tester was parent or
student. Either the test-retest condition or the
passage of 5 days seemed to increase the score
on this test. However, with the PPVT there
was no significant change in the score when
the tests were administered within approxi-
mately 5 days of one another, regardless of
who the first tester was.


TABLE 2
RELATIONSHIP BETWEEN COMBINED PARENTS AND
SOPHISTICATED TESTERS ON PI AND PPVT

Test     N     r
PI       57    .88**
PPVT     44    .64**

** p < .01.


TABLE 3
DIFFERENCES BETWEEN TEST RESULTS RELATIVE TO
ORDER OF TESTING BY THE PARENTS

Test                                    N             D      SD
PI
  Parents (combined, first & second)    57            .97    1.62
  Parents first                         25            3.09   2.27*
  Parents second                        32            4.12   2.04*
PPVT
  Parents (combined, first & second)    44            2.1    1.05
  Parents first                         25            1.79   1.60
  Parents second                        [illegible]   2.32   [illegible]

* [footnote illegible]


Discussion 


Two factors have clearly emerged from this
study. First, we have seen a demonstration of
effective administering of psychological tests
by the parents. It must be kept in mind that
the parents were compared with a sophisti-
cated group of students who were trained in
manipulating test techniques and keenly aware
of the need for objectivity in testing situa-
tions. Despite these factors, very significant
correlations, particularly on the PI, were ob-
tained. It should be noted that the test-retest
reliability coefficients (Dunn, 1965) for the
PPVT in the age range 4 years, 6 months to
[illegible] years are very similar to the intertester co-
efficient established in this study. Thus, it is
suggested that both results demonstrate the
comparable effectiveness of the parents and
students in these test administrations. It must
be recognized that the parents were only
performing the testing and scoring functions—
not the interpretation of the results.

Second, there was evidence of the efficiency 
of the parent group, as shown by their high 
level of agreement with graduate students on 
the PI (74-79%). Such effectiveness was 
present considering both educational back- 
ground and skin-color variables within the 
limits of the sample. The fact that there was a 
significant difference resulting from the order 
in which the PI was given, regardless of 
which group tested last, suggests even greater 
similarity in effectiveness between the stu-
dents and parents and demonstrates that the
children learned to be more accurate through
the passage of time, more exposure to the
test items, or both. The PPVT scores
evidently were not so affected. 

In summary, then, the effectiveness re- 
vealed by the parents in this study supports 
the findings of Reiff and Riessman (1964) 
that there is a potential corps of untrained 
people who may be used for services requiring 
some areas of testing skill. Highly motivated 
individuals indigenous to the particular set- 
ting may very well provide the traits needed 
for negating the handicaps inherent in lack of 
professional training. Their motivation may 
well be related to the special recognition asso- 
ciated with these professional tasks. In fact, 
one mother referred to a neighbor’s comments: 
“Who do you think you are with your tests, 
a psychologist or something?” The mother re- 
ported with obvious pride, “I’m learning how 
to do something—something important.”
Additional training of the parents for
building-in objectivity and test manipulative 
skill is indicated and would help to reduce 
some of the differences found in this study. It 
is likely that the qualities demonstrated by 
the parents as testers may also be used in 
other aspects of professional service, such as 
observation and handling of data. Each utili- 
zation is being examined in an ongoing study 
by the author. 

Beyond the testing-research fields, Pearl 
and Riessman (1965) have studied and sug- 
gested a wide range of “new careers for the 
poor,” including direct care services, counsel- 
ing, teacher assisting, etc. It is essential that 
careful study associated with immediate social 
needs guide us in the further development of 
the handling (under appropriate control) of 
professional tasks by people with limited for- 
mal education. No doubt this approach raises 
some dangers. The necessary selection, train- 
ing, and continued supervision must be as- 
sumed by the broadly educated and trained 
professionals. There may well be a necessity 
for reexamination of the graduate prepara- 
tion of psychologists, with a view toward such 
evolving manpower solutions. Some examina- 
tion has been given to this direction (Aller- 
hand, 1965). No doubt much more is re- 
quired. 


5 A mimeographed copy of this paper may be ob-
tained from the author.


IMMEDIATE SELF-IMAGE CONFRONTATION AND
CHANGES IN SELF-CONCEPT *


HARRY S. BOYD AND VERNON V. SISNEY
Veterans Administration Hospital, Oklahoma City 


Changes in self-concept and concepts of interpersonal behavior of inpatients
on a psychiatric ward were measured by Leary’s Interpersonal Check List
following self-image confrontation via video tape, and compared with a con-
trol group which was not given the self-image confrontation. Hypotheses
regarding directions and kinds of change were developed and were supported by
the experimental results. Interpersonal concepts of the self, the ideal self, and
the public self became less pathological and less discrepant with one another 
following the self-image confrontation, and differences between experimental 
and control groups remained significant 2 wk. later, with 1 exception. 


Most of the studies in the relatively new 
field of self-image confrontation have been 
conducted since 1960 and, with few excep- 
tions, have been primarily of the exploratory 
type. As with many early and exploratory 
studies, control groups have largely been in- 
adequate or nonexistent. However, most 
authors have reported that the technique of 
confronting a patient or experimental sub- 
ject with his own image or behavior has pro- 
duced marked changes in behavior (Corneli- 
son & Arsenian, 1960; Miller, 1962; Walz & 
Johnston, 1963; Ward & Bendak, 1964). 

Other studies have measured the effects of 
self-image confrontation on a number of physi- 
ological variables, such as GSR (Dickinson & 
Ray, 1965) or heart rate (Murray, 1963; 
Verwoerdt, Nowlin, & Agnello, 1965). One of 
the few studies which measured behavioral 
changes and which had reasonably adequate
controls was performed by Moore, Chernell,
and West (1965). The authors used video-
taping techniques and immediate playback of
interviews to subjects (Ss). They reported
that their findings indicated significant posi- 
tive behavioral changes in psychiatric inpa- 
tients, but did not claim to have controlled all 
relevant variables. 


METHOD 
Hypotheses 


There is much theoretical and experimental sup-
port for the hypothesis that the degree to which the
self is misperceived is highly correlated with behav-
ioral or psychiatric disorder (Rogers, 1951). A num-
ber of studies using the Machover Draw-A-Person
technique (reviewed in Machover, 1951; Swenson,
1957) and the Rorschach (Fisher & Cleveland, 1958)
support the hypothesis that hospitalized schizophrenic
or neurotic inpatients have markedly disturbed body
concepts (Montague, 1951), particularly in interper-
sonal contexts (Sullivan, 1954). Cognitive dissonance
theory (Festinger, 1957) would predict that when
schizophrenic or neurotic subjects are confronted
with an accurate and realistic recording of their own
behavior (such as video-tape recording) there should
be a dissonance set up between their distorted self-
image and the more accurate one, and thus that
there should be a shift of the S’s self-image in the
direction of increased reality, or, with less likelihood,
a distortion of the perceived recording. Thus, it is
hypothesized that shifts in the self-concept of the S
will occur with some frequency following the self-
image confrontation.

This study was supported in part by the Vet-
erans Administration. Video-taping equipment was
furnished through the courtesy of Video Electronic
Systems, Tulsa, Oklahoma.

Hypothesis 1. The Leary Interpersonal Check List
(ICL; Leary, 1956) provides a technique by which S 
can describe his self-concept by means of an adjec- 
tive checklist containing descriptions of various 
kinds of interpersonal attitudes and behavior. Since 
the theoretical position presented here is that one’s 
self-concept would shift in the direction of greater 
appropriateness and/or accuracy and lesser distor- 
tion following self-image confrontation, it would be 
predicted that scores reflecting self-concept on the 
Leary ICL would shift in the direction of lesser 
pathology as defined by Leary. 

Hypothesis 2. It is hypothesized that the emo- 
tionally disturbed patient has, at best, distorted and 
inadequate anchorage points by which to develop an 
accurate and appropriate self-concept or a realistic 
ideal. In addition, it would be expected that his 
perception of his “public face” or persona would be 
equally distorted by his inability to accurately per- 
ceive the interpersonal behaviors toward himself that 
provide the basis for accurate expectations from 
others. Thus his concept of himself, his ideal self, 
and himself-as-others-see-him or public self are
likely to be widely discrepant from one another.
Therefore, it would be further hypothesized that 
following a self-image confrontation these three as- 
pects of the self would become less discrepant from 
one another, that is, would approach one another 
more closely. 

Hypothesis 3. The self/self-as-others-see-him (pub- 
lic self) distance should decrease, since these aspects 
of the self are more directly involved by the experi- 
ence of seeing oneself from the “outside.” 


Experimental Design 


The Ss were selected from male inpatients on the 
neuropsychiatric wards at a Veterans Administration 
Hospital. All patients who were present on the wards 
formed the pool from which Ss were drawn. The 
only exclusions were made on the basis of the pres- 
ence of mental deficiency, diagnosed or suspected 
neurological disorder, or addiction to alcohol, drugs, 
or barbiturates. Other patients were excluded from 
the sample because they did not remain in the hos- 
pital for the 3 weeks that the experiment required. 
The 14 patients remaining were then randomly as- 
signed to the experimental and control groups, and 
the specific assignment was known only to the ex- 
perimenters, in order to avoid biasing the attitudes 
of the ward personnel. The average age for the ex- 
perimental group was 37.6 years and 39.6 years for 
the control group. In terms of diagnostic categories 
the experimental group contained the following: 
schizophrenia (various subgroups), 3; personality or 
character disorders (various types), 3; and schizo- 
phrenic personality disorder, 1. The control group 
contained: schizophrenia (various subgroups), 3; 
personality or character disorder (various types), 2; 
depression, 1; and agitated depression, 1. Thus the 
groups were quite similar in their average age and 
diagnostic composition. Several days before the self-
image confrontation section of the experiment, Ss
were administered the ICL by the experimenters in
groups containing Ss from both experimental and
control groups. On this administration they were
asked to rate first themselves, then the ideal self, and
finally the way in which they believed they were
seen by other people. On subsequent administrations
the Ss were given the Leary ICL sheet without
further explanation, unless they requested clarifica-
tion of procedure. There is no evidence to suggest
that group administration of the ICL differs signifi-
cantly from individual administration, but in any
case the conditions for the two groups were identical,
and thus should not contribute to differences in re-
sults between the experimental and control groups.

Fig. 1. Sample graphing of an S’s scores on the
Leary ICL.

Each S was individually brought into a room
containing the camera and recording equipment, 
which was not concealed. The S was given a stand- 
ardized interview which was designed to elicit a 
relatively high level of improvement and which lasted 
approximately 10 minutes. A large monitor receiver 
was present at one side of the S, but was not turned 
on until after the interview. The interview covered 
the S’s reactions and feelings concerning the other 
patients on the ward, himself, his family, the 
experiment in progress, and the experimenters. Since 
the experimenters were aware of the membership of 
each S in either the experimental or control groups, 
it is conceivable that their behavior might have 
been biased, but it is hoped that the use of the 
standardized interview minimized any such biasing. 

Following the interview, all Ss were then in- 
structed to watch the monitor near them. The 
interview was immediately replayed for the experi- 
mental group. The control group was shown a 10- 
minute taped segment of a daytime television 
comedy. After each S had been given an opportunity to ask
questions or talk about how he felt, he was cau-
tioned not to discuss the experiment with other
patients for 2 days. The Leary ICL was then given 
in a different room for the second time. Two weeks 
later each S was given the Leary ICL for the
third time. 


RESULTS 


The scores on the Leary ICL were con- 
verted to the Dominance-Love coordinate 
points and were graphed as shown in the 
example, Figure 1. From the graphed data 
two sets of derived data were obtained, 
pathology-change scores and perimeter-change 
scores. 

Hypothesis 1 (pathology change). Distance 
from the center of the Leary coordinate sys- 
tem is measured by standard scores and re- 
flects the normal distribution of scores, from 
most common (in the center) to most extreme 
or pathological concepts toward the edge. 

For each S, the three self-ratings were 
plotted, the first representing self-concept
prior to the experimental manipulation, the
second reflecting the self-concept immedi- 
ately following the self-image confrontation, 
and the third representing the self-concept 2 
weeks subsequent to the self-image confronta- 
tion. The first self-rating was compared with 
the second and then with the third. If the 
self-concept summary point moved closer to 
the center, it was scored as less pathological; 
if it moved further from the center, it was 
scored as more pathological. The same pro- 
cedure was followed for the ideal self-concept 
and for the public self-concept. Scores were 
then summed across concepts and an overall 
rating of pathology change obtained. The 
number of changes reflecting decreased pa- 
thology and disregarding degree of change 
was summed for each group. Group dif-
ferences were significant in the predicted
direction (χ² = 4.67, p < .05).

The original self-concept, ideal self-concept, 
and public self-concept were compared with 
the Leary concepts obtained 2 weeks later. 
The results significantly discriminated be- 
tween groups in the predicted direction 
(χ² = 7.14, p < .01). Hypothesis 1 was
therefore supported. 
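
The scoring rule for pathology change can be sketched as follows. This Python is an illustration only and is not the authors' procedure: the coordinates are invented, the function names are hypothetical, and the sketch assumes that each summary point is given as a (Dominance, Love) pair expressed as a deviation from the center of the Leary grid. Each concept is classified by whether its summary point moved closer to or further from the center, and the classifications are tallied over all concepts and Ss in a group, disregarding the degree of change.

```python
from math import hypot


def change_direction(before, after):
    """Classify one concept's shift by its distance from the grid center.

    Points are (dominance, love) pairs, assumed to be deviations from the
    center of the grid (an assumption of this sketch, not of the source).
    """
    d_before = hypot(before[0], before[1])
    d_after = hypot(after[0], after[1])
    if d_after < d_before:
        return "less"  # moved toward the center: scored as less pathological
    if d_after > d_before:
        return "more"  # moved away from the center: scored as more pathological
    return "same"


def tally_changes(subjects):
    """Count change directions over every concept of every S in a group."""
    counts = {"less": 0, "more": 0, "same": 0}
    for concepts in subjects:
        for before, after in concepts.values():
            counts[change_direction(before, after)] += 1
    return counts


# Invented (before, after) summary points for two hypothetical Ss.
example_group = [
    {"self": ((12.0, -9.0), (6.0, -4.0)),
     "ideal": ((3.0, 4.0), (3.0, 4.0)),
     "public": ((-8.0, 6.0), (-10.0, 9.0))},
    {"self": ((5.0, 12.0), (2.0, 3.0)),
     "ideal": ((1.0, 1.0), (0.5, 0.5)),
     "public": ((7.0, 0.0), (4.0, 1.0))},
]
example_counts = tally_changes(example_group)
```

Tallies of this kind for the experimental and control groups would supply the frequencies for a chi-square comparison; how the authors handled unchanged scores is not stated in the text, so the "same" category here is an assumption.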

Hypothesis 2 (perimeter change). For each 
S, the sets of coordinate points representing 
the S’s concepts of self, ideal self, and public 
self on each of the three occasions the ICL 
was administered were plotted. The perimeter 
of the triangle thus formed was measured in 
standard score units. This measure can be
considered a function of the discrepancy be-
tween self-concept, ideal self-concept, and
public self-concept. The two perimeter mea-
sures following the self-image confrontation
were compared with the perimeter measure 
obtained prior to the experimental manipula- 
tion. A perimeter decrease would indicate that 
the summary points reflecting the three con- 
cepts as measured by the Leary ICL had 
moved closer together, as was hypothesized
for the experimental group. Since the as-
sumption of ordinal measurement at least
could be met by this measurement technique,
the scores were analyzed by the Mann- 
Whitney U Test (Siegel, 1956). 
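
A minimal sketch of the perimeter measure and of the U statistic is given below. It is not taken from the original study: the function names are hypothetical, and the convention for ties (each tie counted as one half) is an assumption of the sketch. The perimeter is the sum of the three straight-line distances between the self, ideal-self, and public-self summary points; U is obtained by counting, over all pairs of scores from the two groups, how often a score from one group exceeds a score from the other, and taking the smaller of the two counts.

```python
from math import dist


def perimeter(self_pt, ideal_pt, public_pt):
    """Perimeter of the triangle formed by the three concept summary points."""
    return (dist(self_pt, ideal_pt)
            + dist(ideal_pt, public_pt)
            + dist(public_pt, self_pt))


def mann_whitney_u(xs, ys):
    """Smaller of the two U counts for two independent samples.

    Ties are counted as one half (an assumption of this sketch).
    """
    u_x = sum(1.0 if x > y else 0.5 if x == y else 0.0
              for x in xs for y in ys)
    u_y = len(xs) * len(ys) - u_x
    return min(u_x, u_y)


# A right triangle with sides 3, 4, and 5 has a perimeter of 12.
example_perimeter = perimeter((0.0, 0.0), (3.0, 0.0), (3.0, 4.0))

# Two invented sets of perimeter-change scores with no overlap give U = 0.
example_u = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

With two groups of seven Ss, U can range from 0 to 49; the sketch does not compute the associated probability, which in the study would have been read from published tables.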

When the original perimeter score was com-
pared with the perimeter score obtained im-
mediately following the self-image confronta-
tion, the U score obtained between the
experimental group and the control group was 
significant (U = 10, p < .036). When the 
original perimeter score for each group was 
compared to the score obtained 2 weeks fol- 
lowing the experiment, the group differences 
were less, enough only to suggest a trend 
(U = 14, p < .104). Thus Hypothesis 2
received some support. 

Hypothesis 3 (public self and private self). 
It was hypothesized that scores based on 
self-concept and scores based on S’s concept 
of his public self should move closer together 
following the S’s exposure to his public self 
through the self-image confrontation. The
distance in standard score units between the 
self summary point and public self summary 
point on the Leary grid was measured for 
each S and compared with distance obtained 
for the same S on subsequent administrations 
of the Leary ICL. 

The scores were sorted into high- and low- 
change categories, half going into each cate- 
gory. A comparison of the experimental and 
control group discrepancy-change scores im-
mediately following the self-image confronta-
tion suggested a trend in the predicted direc-
tion (χ² = 2.8, p < .10). A comparison of the
discrepancy-change scores for the two groups 
after a 2-week period showed significant 
changes in the predicted direction (χ² = 5.6,
p < .02). Hypothesis 3 received some support. 
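
The sorting of scores into high- and low-change halves, followed by a chi-square comparison of the two groups, might be sketched as below. The scores are invented, the function names are hypothetical, and the chi-square is computed without a continuity correction; whether the authors applied a correction is not stated, so the sketch should not be expected to reproduce the reported values.

```python
def median_split_counts(exp_scores, ctl_scores):
    """Split the pooled scores into a high and a low half and count by group.

    Returns (exp_high, exp_low, ctl_high, ctl_low). Scores tied at the cut
    are assigned by sort order, which is arbitrary.
    """
    pooled = [(s, "exp") for s in exp_scores] + [(s, "ctl") for s in ctl_scores]
    pooled.sort(key=lambda pair: pair[0], reverse=True)
    half = len(pooled) // 2
    exp_high = sum(1 for _, group in pooled[:half] if group == "exp")
    ctl_high = half - exp_high
    return (exp_high, len(exp_scores) - exp_high,
            ctl_high, len(ctl_scores) - ctl_high)


def chi_square_2x2(a, b, c, d):
    """Chi-square for a 2 x 2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Invented discrepancy-change scores (larger = greater decrease in the
# self/public-self distance) for seven Ss per group.
exp_change = [9.1, 7.4, 6.8, 6.0, 5.5, 4.9, 1.2]
ctl_change = [5.1, 3.3, 2.8, 2.0, 1.9, 0.7, 0.4]
table = median_split_counts(exp_change, ctl_change)
chi2 = chi_square_2x2(*table)
```

In this invented example six of the seven experimental Ss fall in the high-change half, giving the table (6, 1, 1, 6).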


Discussion 


The three hypotheses tested were sup- 
ported, and it seems likely, at least, that the 
experience of immediate self-image confronta- 
tion produces some changes in the self- 
concepts of inpatients on a neuropsychiatric 
ward. Specifically, after only one exposure the 
pathology level of the experimental group 
became less extreme, while in the control 
group the pathology level remained the same 
or became more extreme, and these results 
remained over at least a 2-week period. More- 
over, the various concepts of self, including 
the ideal self and the public self, moved closer 
together for the experimental group than they 
did for the control group, although the control 
group also showed some shift in the predicted 
direction. Finally, the S’s public self-concept 
and the S’s own concept of himself moved
closer together in the experimental group. It
must be remembered that the meaning of 
these findings is to some degree dependent 
upon the validity of the Leary ICL. It would 
therefore be appropriate to evaluate the ef- 
fectiveness of the self-image confrontation 
technique by other measures. 

The authors do not regard the current 
findings as a test of cognitive dissonance 
theory. Rather, cognitive dissonance theory 
and Sullivanian theory form a convenient 
conceptual frame of reference within which 
to discuss the effects of self-image confronta- 
tion. Certainly if one were to set out to test 
the theoretical position used in this research, 
more crucial tests should be devised and a 
greater N evaluated. However, the present
study is not a test of theory, but of the 
meaning and, ultimately, of the usefulness of 
self-image confrontation as a therapeutic 
technique and, as such, strongly suggests that 
self-image confrontation does have a measur- 
able and presumably positive effect on some 
aspect of the experience of emotionally 
disturbed inpatients. 

Many theorists, particularly in the modern 
versions of psychoanalytic theory, have sug- 
gested that perhaps one of the aspects of 
psychotherapy that is useful to the patient is 
to see himself as others see him, without 
praise or blame. Particularly, patients who
have an especially bizarre or socially inap- 
propriate manner of relating to others might 
be expected to benefit from a chance to 
observe their own behavior from “without.” 
Further experiments are planned to determine 
whether such patients would not indeed derive 
more benefit from such an experience or ex- 
periences than would “neurotic” outpatients, 
who presumably have a more adequate con-
cept of the appropriateness of their own
behavior. 

An alternative explanation for our findings 
involves the possibility of group bias. Al-
though Ss were assigned to groups randomly,
when the dropouts were excluded, the experi- 
mental group had a majority of Ss from a 
different ward than the control group. While 
in theory both wards are identical and pa- 
tients are assigned to wards in rotation, it is 
of course possible that these findings might
be contaminated by this uncontrolled vari-
able. Certainly further experimentation is
warranted in which this factor is controlled.


BASIC ELEMENTS IN THE PROCESS OF PSYCHOTHERAPY:
A RESEARCH STUDY*


The study investigated the hypothesis, based on client-centered theory, that
movement in the client’s process behavior (manner of problem expression, 
depth of personal exploration, and manner of relating to the therapist) 
is positively related to the level of the therapist conditions (congruence, 
empathic understanding, and positive regard) and to case outcome (pre- 
to posttest changes). Ss were 15 hospitalized schizophrenic therapy cases. 
Therapist conditions and patient process variables were rated from tape- 
recorded segments of therapy interviews. It was found that patient process 
movement over therapy was not related to level of therapist conditions nor 
to case outcome. Level of therapist conditions and level of patient process 
behavior were positively related to case outcome and to the perception of 
therapist conditions by the participants. Specific levels of therapist and patient 
behavior were associated with successful outcome. 


Recent reviews of research in psychotherapy
and counseling attest to the increasing number
of factors found to be important for the inter-
action and outcome of therapy (Dittmann,
1966; Patterson, 1966). These factors cover
a wide range of therapist and patient vari-
ables. There is, however, a distinct absence
of research which attempts to integrate thera-
pist and patient factors. One difficulty is the
number of variables that are involved. Even
large studies often can look at only a few
aspects of the therapy process. Another major
obstacle is the scarcity of theoretical state-
ments that objectively specify the elements
in therapy and in the process of personality
change.

The theoretical framework, hypotheses, and instru-
mentation of the study are inseparable from the
major program carried out by the Psychotherapy
Research Group staff at the University of Wisconsin
under the direction of Carl R. Rogers. Test data
and tape recordings used in the study were collected
by the Research Group.

2 The writer wishes to express his gratitude to
Edward Williams for the difficult task of obtaining
segments and recording test data; to Edgar Anderson
for statistical analysis; and to Emily Early, Shirley
Epstein, Judy Le Roy, Ann Pfeffer, Carol Raff,
Gene Ridberg, Diane States, and George Talbot, who
served as raters. Charles B. Truax contributed to the
initial design of the study.

In several papers Rogers has put forward 
an explicit statement of the essential ele- 
ments in the process of therapy. In the most 
comprehensive of these statements (Rogers, 
1959b), the essential conditions for therapy 
were stated in the form of an if-then hypothe- 
sis: If certain therapist and client conditions 
are present, then predictable changes take 
place in the client. The essential therapist 
conditions are that the therapist be congruent 
or genuine in the relationship, that he fully 
accept or maintain unconditional positive 
regard for the client, and that he empathically 
understand the present experience of the 
client from the client’s frame of reference. 
The only conditions necessary for the client 
are that he be vulnerable or anxious and 
that he perceive to a minimal degree the 
therapist’s congruence, regard, and empathy 
(Rogers, 1959b). The change process that 
takes place in the client under these condi- 
tions was subsequently formalized as a scale 
of process in psychotherapy (Rogers, 1959a; 
Walker, Rablen, & Rogers, 1960). It is 
composed of several subscales along which 
the client moves from a fixed, rigid, and 
static mode of experiencing to a flexible, 
meaningful, and flowing mode. 
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These theoretical developments and some 
promising research results (Barrett-Lennard, 
1962; Halkides, 1958) led to a formulation 
of a research design. The design entailed the 
measurement of the conditions provided by 
the therapist in the interview, their perception 
by the patient, changes on Process variables 
by the patient in the interview, and therapy 
outcome in terms of changes on personality 
test measures and hospitalization status for 
therapy and control groups of hospitalized 
schizophrenic patients. The research program 
and data collection are described in Rogers 
(1967). The present report describes one 
study in the program of the main treatment 
group. The basic hypotheses of the study are 
that there is greater positive change in the 
Process behavior of the patient when thera- 
pist conditions are higher; that therapy out- 
come, in terms of change on personality test 
measures, is more successful when therapist
conditions are higher; and that both process
changes and therapy outcome are positively
related to the patient’s perception of therapist
conditions.


METHOD 
Subjects 


The sample of the study consisted of 15 individual
therapy cases. Of these cases 9 were shorter
ones (fewer than 55 interviews) that had terminated
at the time of data collection, and 6 were longer
ones (more than 85 interviews) still continuing at
that time. Mean case length for the sample was 74
interviews and 14 months.

Ten therapists participated, five of whom saw
two cases apiece. Therapists varied considerably in
range of experience and in orientation, though the
predominant approach can be characterized as client
centered. Patients were, with some exceptions, seen
for therapy twice a week for an hour. The interviews
were tape-recorded.


35), chronicity (under or over 8 months of hospitali- 
zation), and socioeducational status (above or below 


patients could not be 
assured initially that they would receive individual 
were many motivated for 
therapy. However, any attempt to engage the patient
in therapy depended upon prior completion of a
battery of tests. 


Design and Measures 


The design employed a multiple-approach method- 
ology to the study of therapy. Measures were 
obtained from three sources of data: personality 
tests, interview interaction, and the perception of 
the therapist by the patient and therapist. 

Case-outcome measures. The test battery from 
which the outcome measures were obtained con- 
sisted primarily of the Rorschach, a 10-card TAT, 
the MMPI, a self-concept Q sort, the Wechsler Adult 
Intelligence Scale, and a 246-item anxiety scale 
(cf. Rogers, 1967). It was given prior to therapy, 
at 6 months, and then yearly or at termination.
Case-outcome measures were based on change from 
the initial battery to the latest one. 

The main estimate of case outcome, the combined 
outcome score, was the mean score for five change 
measures, using standard scores based on 30 therapy
and control cases in the research program. The five 
measures were: a clinical estimate of change over 
therapy by two clinical psychologists, based on 
inspection of the test material in the early and late 
test batteries; change in the amount of agreement 
between the patient’s self-concept and a professional 
concept of the ideal person, using the self-concept Q 
sort; change on the anxiety scale; change in the 
direction of greater adjustment on MMPI items 
describing present functioning; and percentage of 
time of hospitalization (reversed) since entry into 
the research. This score provides a balanced composite
of measures based on projective test material,
self-report instruments, and an ecological variable. 
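
The composite described above can be sketched as follows. This is a minimal illustration, assuming each change measure is converted to standard (z) scores across cases and the z scores are then averaged case by case. The function names and the four-case data are hypothetical, the population standard deviation is an arbitrary choice, and the original scoring (Truax, 1962), which standardized on 30 therapy and control cases, may differ in detail.

```python
from statistics import mean, pstdev

def standard_scores(values):
    """Convert raw change scores to z scores across cases."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]

def combined_outcome(change_measures):
    """Average the standardized change measures case by case.

    change_measures maps a measure name to one score per case, already
    oriented so that higher means more improvement (for example,
    percentage of time in hospital would be reversed beforehand).
    """
    standardized = [standard_scores(v) for v in change_measures.values()]
    return [mean(case) for case in zip(*standardized)]

# Hypothetical change scores for four cases (not data from the study).
example = {
    "clinical_estimate": [3, 5, 7, 9],
    "anxiety_change": [1, 2, 3, 4],
    "time_out_of_hospital": [10, 20, 30, 40],
}
scores = combined_outcome(example)
```

Because every measure is standardized before averaging, no single instrument dominates the composite merely because of its raw-score range.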

Also used as case-outcome measures were an 
MMPI change score, which is the change (early
minus late) in the total number of items away from 
a T score of 50 (the “normal” point) on the nine 
clinical scales of the MMPI profile, and the cli- 
nicians’ estimates of change (CLIN rating) referred 
to above. The latter were ratings from 1, extreme 
deterioration, to 9, extreme improvement, judged 
on the basis of the early and late test batteries for 
each case. The MMPI change score and CLIN 
rating, while not as comprehensive as the combined 
outcome score, were used to provide specific informa- 
tion regarding a strictly empirical basis for outcome, 
in the case of MMPI change, and a purely clinical 
estimate, in the case of the CLIN ratings. 

As would be expected, the three outcome measures
were strongly correlated with each other, with r’s
of .84 between combined outcome score and MMPI
change, .80 between combined outcome score and
CLIN rating, and .68 between MMPI change and
CLIN rating. The intercorrelations of the five individual
measures making up the combined outcome
score ranged from -.13 to .80, with a median r of
.39. The five individual measures correlated highly
with the composite (from .58 to .84). The two
measures not based on self-reports, clinical inferences
from test data and percentage of time out of hospital,
were highly correlated (r = .80), but were only
slightly related to the three measures of change
based on self-report instruments. The three self-report
measures were moderately related to each
other (.55-.79). The presence of both related and
unrelated measures enhanced the validity of the combined
outcome score (cf. Guilford, 1954, p. 393).

For the CLIN rating, fairly good agreement was 
reported between the two clinicians who estimated 
change (82% agreement within 1 point on the scale). 
While the reliability of the MMPI change score 
could not be estimated, the score uses all of the 
clinical scales of the MMPI profile and is, therefore, 
based on a large number of item scores. Also there 
is substantial agreement between it and the other 
two change measures.

Perceived therapist conditions measure. The Therapist
Relationship Inventory (Barrett-Lennard, 1962)
was used for the measurement of perceived therapist
conditions. It consists of 72 items, 18 for each of
4 therapist conditions: therapist congruence, empathic
understanding of the patient, positive regard
for the patient, and unconditionality of regard for the
patient; it also yields a total score. In addition to using the
relationship inventory for the patient, a parallel
form was used for the therapist on which he rated
his own perception of the conditions he provided.
The relationship inventory was administered at the
third month of therapy and every subsequent 3
months. Only the first inventories were used in the
present study. For three patients who did not fill
out a 3-month inventory, the 6-month ones were
used instead.

Patient and therapist interview measures. The 
Process behavior of the patient was measured on 
three scales—problem expression, intrapersonal ex- 
ploration, and manner of relating—which are re- 
visions of several of the subscales of the Psycho- 
therapy Process Scale discussed above. They are 
multistep scales with a description of several sen- 
tences at each step. The Problem Expression Scale 
refers to the recognition of and concern with per- 
sonal aspects of problem situations. It ranges from 
no recognition of problems or difficulties at the low
end to an ongoing resolution of problems in terms 
of changes in the person’s experience at the high 
end. The Intrapersonal Exploration Scale ranges from 
complete absence of personally relevant material to 
an active and deep exploring of self. The Manner of 
Relating Scale concerns the overt or implied qualities 
of closeness in the relationship and varies from 
rejection of the therapist to the full acceptance 
of a personal relationship. 

The therapist condition scales (congruence, accurate
empathy, and positive regard) are derived from
Rogers’ (1957) statement of the necessary and sufficient
conditions for personality change, and from
scales first constructed by Halkides (1958) to test
this theory. The Congruence Scale refers to the
therapist’s awareness and integrated expression of his
experience of the client and of himself. At one
end he presents a facade, while at the other he is
fully himself in response to the client, with complete
moment-to-moment integration in the relationship.
The Accurate Empathy Scale ranges from being unaware
of even the most conspicuous of the client’s
feelings to an accurate response to the full range of
expressed and implied feelings in the client’s statement.
The Positive Regard Scale concerns an outgoing
and nonpossessive attitude of caring for, of
prizing, the client. At the low end the therapist is
inattentive, disinterested, cold, while at the high end
he shows a deep caring and warmth for the client.
Reliabilities of level and movement scores are 
presented below. 

Rating procedure. To make the rating task man- 
ageable five interviews were selected from each case, 
one at each quarter of case length—the beginning, 
the 25% point, the 50% point, the 75% point, and 
the end or latest point. For example, for a 40- 
interview case, the first, tenth, twentieth, thirtieth, 
and fortieth interviews would be used. The selection
of interviews at these intervals provided measures
of behavior throughout the course of therapy.
Linear movement by the patient was measured by 
means of the slope, an estimate that made use of 
all the interview points. The level of therapist condi- 
tions was obtained for each therapy point as well 
as over all points.
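
The selection rule can be sketched in a few lines. This is an illustrative reading only: the function name is hypothetical, and rounding the quarter points to the nearest interview is an assumption, since the text gives only the 40-interview example.

```python
def quarter_point_interviews(n_interviews):
    """Return the five interview numbers sampled from a case:
    the first, the 25%, 50%, and 75% points, and the last."""
    fractions = (0.25, 0.50, 0.75)
    # Round half up to the nearest interview (assumed rule), never below 1.
    middle = [max(1, int(n_interviews * f + 0.5)) for f in fractions]
    return [1] + middle + [n_interviews]

selected = quarter_point_interviews(40)  # the 40-interview example in the text
```

For a 40-interview case this yields the first, tenth, twentieth, thirtieth, and fortieth interviews, matching the example given.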

The rating material consisted of 4-minute segments 
which were randomly selected from the first third 
and last third of each interview. Where an inter- 
view did not provide sufficient verbal interaction 
(two separate statements by each participant for 
each segment), the closest adequate one was substi- 
tuted. The sampling procedure provided two seg- 
ments from each of five interviews for each 
case: 150 segments in all. Raters received intensive 
training on samples of different patients and thera- 
pists. The order of rating the segments was random- 
ized, with every rater rating the segments in a 
different order. Raters made their ratings completely
independent of one another, and different raters 
were used for the patient and therapist scales. 
Segments were edited to delete names and informa- 
tion regarding stage of therapy. 

The number of cases available for each measure 
was 15 for all of the rating scales and the Therapist 
Relationship Inventory; 14 for the combined out- 
come score, CLIN rating, and Patient Relationship 
Inventory; and 13 for MMPI change. 


8 The outcome measures and behavior scales were 
contributed by various members of the research 
group. Appropriate credit for each is as follows: 
combined outcome score by C. B. Truax; CLIN 
rating by C. B. Truax, J. Liccione, and M. Rosen- 
berg; MMPI change by F. van der Veen; Problem 
Expression Scale by F. van der Veen and T. M. 
Tomlinson; Intrapersonal Exploration Scale by C.
B. Truax; Relationship Scale by E. T. Gendlin and
M. Geist; Congruence Scale by J. T. Hart, Jr.;
Accurate Empathy Scale by C. B. Truax; Positive
Regard Scale by J. Spotts and W. P. Wharton. The
construction of the combined outcome score is
described in Truax (1962). The clinicians’ ratings
are reported in Truax, Liccione, and Rosenberg
(1962).

Specific hypotheses. The specific hypotheses of
the study were: 

1. Movement in patient process behavior is positively
related to: (a) level of therapist conditions,
(b) perception of therapist conditions by the
patient, (c) case outcome.

2. Case outcome is positively related to: (a) level 
of therapist conditions, (b) perception of therapist 
conditions by the patient. 


RESULTS AND DISCUSSION


Ratings on the two segments from each 
interview were averaged to obtain an inter- 
view score on each scale. Case means are 
based on the five interviews for each case.
The value for the slope is the linear trend 
component, which is the sum of the products 
of each of the five interview points multiplied 
by its linear orthogonal coefficient. The linear 
coefficients are —2, —1, 0, 1, and 2 for the 
five interview points (Edwards, 1960, p. 
239). The trend component represents the 
degree of linear movement over therapy by 
the patient, upward for a positive slope and 
downward for a negative one. 
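
The trend component described above can be written out directly. This is a minimal sketch with a hypothetical function name and made-up interview scores. Because the squared coefficients sum to 10, the component equals 10 times the least-squares slope per interview point.

```python
LINEAR_COEFFS = (-2, -1, 0, 1, 2)  # orthogonal linear coefficients for five points

def linear_trend(interview_scores):
    """Sum of products of the five interview scores and the linear coefficients."""
    if len(interview_scores) != len(LINEAR_COEFFS):
        raise ValueError("expected exactly five interview scores")
    return sum(c * s for c, s in zip(LINEAR_COEFFS, interview_scores))

# Hypothetical case rising half a stage per sampled interview.
rising = linear_trend([3.0, 3.5, 4.0, 4.5, 5.0])
# Hypothetical case with no linear movement.
flat = linear_trend([4.0, 4.0, 4.0, 4.0, 4.0])
```

A steadily rising case gives a positive value, a steadily falling one a negative value, and a flat or symmetric pattern gives zero.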

Case means on both therapist and patient 
scales tended to be distributed around the 
midpoints of the scales, with SDs ranging 
from .5 to 1.4 stages. The case slopes for 
the patient scales averaged close to zero, with 
SDs ranging from .9 to 1.7 stages. Approximately
as many cases had negative as positive
slopes over therapy. 

Interrater reliabilities for the 75 interview 
scores were .44 for problem expression, .42 
for manner of relating, .58 for intrapersonal 
exploration, and .57 for positive regard.
While these correlations are highly significant 
(beyond the .001 level), they also show con- 
siderable differences in the application of the 
scales. Averaged ratings for the rater pairs 
were therefore used in the analyses. By means 
of the Spearman-Brown formula as suggested 
by Guilford (1954, p. 397), the reliability 
estimates of these averages are .62 for prob- 
lem expression, .59 for manner of relating, 
.73 for intrapersonal exploration, and .73 for
positive regard. One rater apiece was used 
for the Congruence and Accurate Empathy 
Scales, due to practical limitations. In a 
study on a sample of similar cases the inter- 
rater reliability for these raters was .62 on 
congruence and .66 on accurate empathy (van
der Veen, 1965b). These reliabilities were
considered sufficient to support the use of 
single raters. The reliability of the slopes is 
not likely to be much below the reliability 
of the level scores. The slopes were based on 
a number of interview points, and correla- 
tions between the early and late interview 
points were uniformly low. 
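
The Spearman-Brown step used for the averaged rater pairs can be checked with a short calculation. The sketch below applies the standard formula for lengthening a measure by a factor k (here k = 2 raters) to the single-rater correlations reported above. The function name is illustrative, and the results agree with the reported estimates to within about .01.

```python
def spearman_brown(r, k=2):
    """Reliability of the average of k parallel ratings, given the
    reliability r of a single rating."""
    return k * r / (1 + (k - 1) * r)

single_rater = {
    "problem expression": 0.44,
    "manner of relating": 0.42,
    "intrapersonal exploration": 0.58,
    "positive regard": 0.57,
}
averaged = {scale: spearman_brown(r) for scale, r in single_rater.items()}
```

The same formula with k = 1 returns the single-rater value unchanged, which is the case that applies to the single-rater Congruence and Accurate Empathy Scales.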

Hypotheses 1a, 1b, and 1c: Movement in 
patient process behavior is positively related 
to therapist level of conditions, to patient 
perception of therapist conditions, and to 
case outcome. 

The first three hypotheses can be considered
together since the results were largely
negative. Seventeen out of 18 r’s between the
slopes on the process scales and the level
scores on the therapist scales were not significant.
For the slopes and the patient perceptions
on the Therapist Relationship Inventory
all r’s were negative, with 11 out of 12
not significant. For slopes and the three
case-outcome scores 8 out of 9 r’s were not
significant.

There are several possible explanations for 
the lack of significant findings for process 
movement. Technical shortcomings in scale 
reliability, scale construction, and segment 
selection may have hidden actual movement. 
Also, it is possible that the theory of linear 
patient movement oversimplifies the change
process. That some type of movement occurs 
over therapy which is related to process be- 
havior is supported by a study on the present 
sample (van der Veen & Stoler, 1965) in 
which it was found that therapist estimates 
of change in the patient, particularly change 
in interview variables, were related to process 
movement. However, the lack of clear-cut re- 
sults in this area suggests a need for caution 
in the conceptualization of the change process 
in the patient. 

Hypothesis 2a: Case outcome is positively 
related to level of therapist conditions.

The correlations of the case-outcome mea- 
sures with the overall means on the therapist 
condition scales are presented in Table 1. 
Empathy showed strong relationships with 
outcome for the overall mean as well as the 
initial, 25%, 50%, and 75% points, particu- 
larly with the combined outcome score. 
Congruence had borderline (p < .10) relationships
with the combined outcome score
for the overall mean and for the 25%
(r = .47) and 75% (r = .48) points of
therapy. The overall mean for positive regard
did not reach significance, though the
25% point correlated significantly with the
combined outcome score (.55) and with
MMPI change (.59). All p values in the
study are for two-tailed tests of significance.

TABLE 1

CORRELATIONS BETWEEN THERAPIST CONDITIONS
(OVERALL MEANS) AND CASE-OUTCOME MEASURES

Measure                      Congruence   Positive regard   Empathy
Combined outcome (n = 14)       .46            .40          [illegible]**
MMPI change (n = 13)            .36            .40          .58*
CLIN rating (n = 14)            .19            .12          .49*

* p < .05.
** p < .01.

Therapist conditions for the more success- 
ful cases were compared with those for the 
less successful cases on each of the three 
therapist scales, with the sample divided at 
the middle on the basis of the combined 
outcome score. The highest level reached by 
the therapist at any point of therapy, the 
lowest level, and the range between them were 
obtained for each case. Using Fisher’s exact 
test it was found that significantly more 
therapists of more successful cases attained 
Stage 3.5 on congruence (range 1-5) and 
Stage 6.5 on accurate empathy (range 1-9) 
than the therapists of less successful cases. 
Also, fewer therapists of more successful cases 
fell below 2.5 on congruence and below 6.0 
on accurate empathy than therapists of less 
successful cases. These differences were significant
beyond the .05 level for a two-tailed
test. Similar, though nonsignificant, differ- 
ences were found on the Positive Regard 
Scale. 
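
The kind of comparison reported above can be illustrated with a small two-tailed Fisher exact test. The cell counts below are hypothetical, since the report does not give the actual frequencies; they assume 7 more successful and 7 less successful cases, with 6 versus 1 therapists attaining a given stage. The function name is also illustrative, and the two-tailed rule (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical counts: rows are more and less successful cases,
# columns are attained and did not attain the critical stage.
p_value = fisher_exact_two_tailed(6, 1, 1, 6)
```

With these hypothetical counts the p value is 100/3432, or about .03, which would fall beyond the .05 level.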

Critical success behavior by the therapist, 
based on the stage descriptions, can be 
characterized as conveying an accurate sense 
of the patient’s and his own experience, in a 
way that furthers self-exploration by the patient.
A critical failure level for the therapist
consisted of withholding himself and not
clearly pointing the patient toward deeper 
self-exploration. 

The findings for the correlations with out- 
come and for the critical high and low scale 
stages provide support for the relationship of 
therapist conditions to outcome, most strongly 
for the condition of empathy, somewhat less 
so for congruence, and least for positive 
regard. 

Hypothesis 2b: Case outcome is positively 
related to the patient’s perception of therapist 
conditions. 

There were no significant correlations be- 
tween outcome measures and the Patient 
Relationship Inventory scores. Case outcome 
was, therefore, not associated with degree of 
perceived conditions. This finding does not 
support that of Barrett-Lennard (1962), 
possibly because his results were for a less 
disturbed population. The actual conditions 
provided by the therapist may not be as 
potent a factor for the perceptions of more 
disturbed patients. A sample of outpatient 
cases was found to perceive significantly 
higher therapist congruence, positive regard, 
and unconditional regard than the inpatient 
schizophrenic cases in the present sample, 
in an unpublished study by the author. Also,
lower conditions were perceived by schizophrenic
patients who were higher on the
Psychasthenia and Schizophrenia scales of the
MMPI (van der Veen, 1961).

Gendlin (1964, pp. 135-136) suggests that 
the perception of the therapist’s attitudes is 
secondary to the psychological change process 
in the patient (carrying forward implicit 
meanings), and that the process may occur 
before positive attitudes are perceived. How- 
ever, as will be evident below, the perception 
of therapist attitudes and the process level 
of the patient are not independent of one
another.

Additional Findings

In addition to the major hypotheses, sev- 
eral other analyses of the interview and test 
data were carried out. 

Patient process level and case outcome. 
A study by Tomlinson and Hart (1962) 
reported negative results for the relationship 
between process movement and case outcome
similar to those of the present study, but for
nonpsychotic cases. However, they found
significantly higher levels of process in the
more successful cases. Correlations between
overall mean patient levels and case-outcome
measures for the present sample were found
to be positive and significant for patient
problem expression and personal exploration
and borderline (p < .10) for manner of relating
(see Table 2). In addition to the overall
means, significant correlations with outcome
were obtained on one or more of the patient
process scales for every therapy point
except the initial interview. These
results clearly show a positive association
between the level of patient process behavior
and his personality change over therapy.

TABLE 2

CORRELATIONS BETWEEN PATIENT PROCESS LEVELS
(OVERALL MEANS) AND CASE-OUTCOME MEASURES

Measure                           Problem      Personal      Manner of
                                  expression   exploration   relating
Combined outcome score (n = 14)     .62*         .67           .50
MMPI change (n = 13)                .68**        .74**         .45
CLIN rating (n = 14)                .58*         .70**         .48

* p < .05.
** p < .01.

The highest process levels reached during 
therapy by the more successful patients were
compared to those of the less successful patients
in the same way as was done for the
therapist conditions. Using Fisher’s exact test, 
it was found that significantly more of the 
more successful patients showed behavior at 
the fourth stage of the Problem Expression 
Scale (range 1-7) and the sixth stage of the 
Personal Exploration Scale (range 1-9) than 
the less successful patients. Condensing 
these stage descriptions, patients who showed 
greater improvement over therapy were more 
likely to express personally relevant material 
in relation to problems. Low scale points and 
the range between low and high points did 
not differentiate between the more and less 
successful groups. 

In a study of less disturbed outpatient 
cases a surprisingly similar association was 
found between degree of personal exploration 
of problems in the initial interview, measured 
on a different instrument, and case success 
(Kirtner & Cartwright, 1958). The present 
finding broadens the association to include 
severely disturbed persons and behavior 
throughout the course of therapy. In view of 
the negative findings for the relationship of 
patient movement to outcome, these results 
suggest that personality change may need to 
be considered as more of an enduring charac- 
teristic of the psychologically well-functioning 
individual (e.g., Gendlin, 1964) than it
usually is. 

Patient process level and perceived therapist
behavior. Table 3 presents correlations
of Patient Relationship Inventory scores with
patient levels of process at the initial therapy
point and averaged over therapy. Perceived
congruence showed pervasively strong relationships
to patient behavior for the initial
point and the overall mean, as well as the
25%, 50%, and terminal points (not shown).
Other perceived attitudes were also related
to the initial interview point.

TABLE 3

CORRELATIONS BETWEEN PATIENT PERCEPTION OF THERAPIST CONDITIONS AND PATIENT PROCESS LEVELS

                                            Patient Relationship Inventory
Process scale          Point of   Positive   Empathy   Congruence   Unconditionality   Total
                       therapy    regard                            of regard
Problem expression     Initial    .42        .37       .47          .31                .50
                       M          .22        .15       .65**        .24                .40
Personal exploration   Initial    .54*       .36       .70**        .39                .64**
                       M          .45        -.02      .65**        .19                .41
Manner of relating     Initial    .36        .67**     .67**        .44                .68**
                       M          .20        .22       .82**        .40                .52*

Note. n = 15.
* p < .05.
** p < .01.

These findings are highly revealing. The patient’s
perception of the therapist, especially
the extent to which he saw the therapist as
being openly himself while in the relationship
with him, was strongly and pervasively related
to the patient’s process behavior. Surprisingly,
process behavior in the initial interview
tended to show strong relationships to
therapist conditions perceived at least 3
months hence. The findings argue for the
relevance of personality factors, factors independent
of the particular relationship, for the
patient’s therapeutic behavior and for his
perception of therapeutic attitudes.

Therapist conditions and perceived therapist
conditions. Therapist conditions were not
related to the patient’s perception of them.
This is consistent with the overall lack of
relationship in our data between therapist
and patient interview variables, except for
the higher process behavior of patients who
perceived higher therapist conditions.

However, as would be expected, the therapist’s
perception of himself (Therapist Relationship
Inventory) was significantly related
to independent ratings of therapist conditions.
The results for the initial point and overall
mean are presented in Table 4. There were
many (13) additional significant associations
at the other therapy points, particularly the
terminal point.

TABLE 4

CORRELATIONS OF THERAPIST PERCEPTIONS OF THERAPIST CONDITIONS WITH RATED CONDITIONS
AND CASE LENGTH

                                        Therapist Relationship Inventory
Conditions scale   Therapy   Positive   Empathy       Congruence   Unconditionality   Total   Length (no.
                   point     regard                                of regard                  interviews)
Congruence         Initial   .30        .53*          .40          .63*               .55*    .62*
                   M         .21        [illegible]   .42          .56*               .49     .61*
Positive regard    Initial   .49        [illegible]   .25          .62*               .55*    .47
                   M         .16        .36           .42          .46                .41     .52*
Empathy            Initial   .47        .40           .25          .48                .46     .08
                   M         .37        .50           .44          .58*               .54*    .18

Note. n = 15.
* p < .05.

As with patient perceptions, significant 
associations existed between behavior in the 
initial interview, when there had been no 
previous contact with the patient, and the 
perception of attitudes after 3 months of 
therapy. Therapist personality factors, there- 
fore, are likely to have contributed signifi- 
cantly to the therapist’s view of his attitudes 
toward the patient. 

Length of therapy. Length of therapy 
showed significant relationships to therapist 
conditions, with the exception of empathic 
understanding (see Table 4). Congruence at 
all therapy points was particularly strongly 
related to length, adding to the evidence that 
more effective therapist behavior is associated 
with an interaction of longer duration (van 
der Veen, 1965b). Length was not related to 
the other variables. 


CONCLUSION 


In the rating procedure raters heard both 
therapists and patients while making the 
ratings. The possibility exists that the level 
of behavior of one influenced the rating of 
the other; for example, therapists may have 
been rated higher with “good” clients, and 
vice versa. However, as was found in a previous
study (van der Veen, 1965b), there
were a number of indications that therapist 
and patient ratings were independent of 
one another. They were not significantly 
correlated with each other and had many 
differential relationships with other case 
variables. 

With respect to the hypotheses of the 
study, the relationship of process movement 
by the patient to therapist conditions and to 
case outcome was not supported. Rather, out- 
come was related to level of process, particu- 
larly to manner of problem expression and 
personal exploration. The hypotheses concern- 
ing the level of therapist conditions in rela- 
tion to case outcome were supported for 
objectively rated conditions, especially em- 
pathy, but not for their perception by the 
patient. Duration of therapy was positively 
associated with therapist congruence and 
positive regard. 

In the analyses of specific scale ratings it 
was found that cases were more likely to be 
successful when the patient engaged in the 
exploration of personal events in relation to 
problems. Critical success behavior for the 
therapist consisted of accurately expressing 
the patient’s and his own experience and 
clearly pointing the patient toward further 
self-exploration. On the other hand, thera- 
pist failure behavior consisted of withholding 
himself and not pointing the patient toward 
self-exploration. Such an operational delinea- 
tion appears entirely consistent with clinical 
theory. 

The patient’s perception of higher therapist 
conditions, especially the genuineness of the 
therapist, was associated with a higher level 
of process behavior. The relevance of person- 
ality factors for the perception of conditions 
was indicated by the strong associations be- 
tween process behavior at the very beginning 
of therapy and the patient’s perception of
therapist attitudes obtained after several 
months of interaction. Similarly for the 
therapist, his own perception of his conditions 
was clearly related to objective ratings of 
them, and conditions provided at the initia- 
tion of therapy were associated with the way 
they were perceived several months later. 

While no claim can be made that the 
research approaches an exhaustive study of
the factors that influence therapeutic personality
change, a consistent pattern emerged
between the three major factors that were 
investigated: interview dimensions, case out- 
come, and participant perceptions. Stated 
briefly, the pattern that emerged is that the 
therapeutic behavior of both the patient and 
the therapist is positively related to case out- 
come and to the degree each perceives high 
therapist conditions. It may be tentatively 
concluded that a sequence such as the fol- 
lowing is likely to occur: when the therapist 
is perceived by both patient and therapist 
as genuine, empathic, and acceptant, then 
both behave in ways that foster the patient’s 
personal exploration of problems, which in 
turn leads to successful therapy outcome. 
Though these results are clearly exploratory, 
their coherence is encouraging. An urgent 
question implicit in the findings is how to 
provide conditions conducive to personality 
change for patients whose process levels and 
perception of positive interpersonal attitudes 
are very low. This and other studies suggest 
that these patients are not likely to be helped 
through psychotherapy.


EFFECTS OF ROLE DEMANDS AND TEST-CUE PROPERTIES
UPON PERSONALITY TEST PERFORMANCE 


ROLF O. KROGER 1 


University of Toronto 


From the assumption that the situation affects test performance by generating 
a set of role demands, an experiment was designed in which 2 randomly con- 
stituted groups of ROTC cadets were asked to describe themselves on the 
SVIB and Welsh Figure Preference Test after being exposed to implicit social 
cues intended to induce differential role taking. Highly reliable, role-specific 
response differences were obtained on both tests. These differences increased 
when controls for accuracy of role perception and for test-cue properties were 
introduced. The results were interpreted as supporting the hypothesis and 
as favoring the conclusion that the test score represents a trait-method-role
unit.


Situational influences in personality test 
performance have usually been investigated 
by giving explicit instructions to Ss or by 
varying the motivational attributes of testing 
situations (Kroger, 1963). In this experiment,
an attempt was made to determine if
test responses can be altered by implicit, non- 
motivational manipulations of the setting. 

The results of any personality test may be 
regarded as including at least three components.
The first is a consequence of the
characteristics of the individual; the second is
a function of the method of assessment used
(Campbell & Fiske, 1959); the third is a
result of the unintended, social cues which
accompany any testing situation, for example, 
cues arising from the characteristics of the 
examiner, from the instructions, and from the 
test titles. These cues are seen as generating 
a set of role demands (Sarbin, 1964) which 
induce the testee to respond to the implicit 
requirements of the situation, much as the 
experimental S responds to the requirements 
of the “social contract” between E and S
(Orne, 1962). 

A distinct set of cues is given off by the 
test items. These cues aid S in responding to 
the role demands presented by the situation. 


1 Based on portions of a doctoral dissertation submitted
to the University of California, Berkeley. I
am indebted to Theodore R. Sarbin, chairman, and
M. Brewster Smith and Erving Goffman, members
of my thesis committee, for advice and constructive
criticism. A shorter version was read at the Western
Psychological Association meeting, Portland, Oregon,
1964.


For example, if S is induced to take the role 
of salesman and if he is to enact that role 
using items of the test, it is necessary that 
the test contain items relevant to the charac- 
teristics of salesmen. Test-cue properties may 
be defined in terms of the extent to which the 
items provide information allowing role-rele- 
vant responses. The presence of such cues is 
thus a necessary condition for the occurrence 
of role-demand effects. 

In this experiment, role demands were ma- 
nipulated by varying the test titles, the set- 
ting, the ostensible purpose of testing, and 
the position of the examiner. The Ss were 
given no special instructions, but merely asked 
to furnish self-descriptions. Members of a 
Naval ROTC unit were randomly assigned to 
two conditions: (a) a “military” one where 
the ostensible purpose was the study of officer
effectiveness, the E a military officer, and
the tests labeled as devices for the study of 
military officers; (b) an “artistic” one where 
the ostensible purpose was the study of ar- 
tistic creativity, the E a psychologist, and the 
tests labeled as devices for the study of ar- 
tistic creativity. 

In the first condition, Ss were expected to 
take the role of “experimental S in a study of 
officer effectiveness” (military officer) in re- 
sponse to the role demands and to enact the 
role by endorsing test items in role-relevant 
ways. In the second condition, Ss were expected
to take and enact the role of “experimental
S in a study of artistic creativity”
(creative artist). Together, these expectations constitute
Hypothesis 1. It was assumed
that Ss knew the role expectations for
officer and artist. This assumption seems 
plausible in view of the familiarity of these 
roles in our culture, and it is indirectly sup- 
ported by the role-specific differences found. 
Since there are individual differences in the 
accuracy with which Ss identify role demands, 
it was expected that Ss whose identifications
were inaccurate would not enact the induced
roles as effectively as Ss whose identifications
were more accurate (Hypothesis 2). Finally,
since test cues vary in their degree of rele- 
vancy, it was expected that role enactment 
would be more effective when S was confronted 
with items containing highly salient cues than 
when he was confronted with items containing 
less salient or irrelevant cues (Hypothesis 3).


METHOD 
Subjects 


All Ss were undergraduates and members of 
the Naval ROTC unit at the University of Cali- 
fornia, Berkeley. Three random samples of 50 men 
each were drawn from the roster of the unit which 
contained a total of 246 men. One sample was as- 
signed at random to the “military” condition; the 
other two samples were assigned on the same basis 
to the “artistic” condition. 

A larger number of men had to be drawn for the 
artistic condition to allow for attrition which was 
expected to occur because of the method of recruit- 
ment used. The design of the experiment required 
that the Ss intended for the artistic condition re- 
main unaware of the connection of the experiment 
with the ROTC unit. Letters were therefore sent to 
the homes of these men. They were asked to volunteer
as subjects for a study of “artistic creativity”
conducted by the department of psychology and
told in the letter that they had been selected as
part of a “representative sample of university students.”
Of the 100 cadets so contacted, 55 appeared
for the experimental session. The Ss who refused to
volunteer did not differ from those who agreed to
volunteer in terms of academic major or year in
college; for example, the volunteer and nonvolunteer
groups contained a similar number of engineering
students. This finding does not rule out, but only
minimizes, the possibility that the volunteers and 
nonvolunteers differed in artistic interest. The cadets 
assigned to the military condition were asked by 
one of their officers to volunteer as Ss for a research 
study. Of this group 45 members agreed to participate.
Thus, the total number of Ss in the study
was 100.


Experimental Situations 


The 45 military Ss were assembled in a regular 
ROTC classroom. The E was an instructor in naval
science with the rank of lieutenant.2 He was in
uniform while serving as E. The Ss were told that
the general purpose of the study was to determine
“what makes a good military officer” and assured
that the individual results would not be seen by
their superiors but forwarded immediately to the
sponsoring psychology department. The SVIB
(Strong, 1951) and drawings from the Welsh Figure 
Preference Test (WFPT; Welsh, 1959) were repro- 
duced on stencils and given the titles, “Military In- 
terest Questionnaire” and “Military Aptitude Test 
IV: Spatial Organization,” respectively. 

The setting of the artistic condition was a classroom
in the psychology building. The E for this
condition identified himself as a psychologist, indi- 
cating that the purpose of the study was to deter- 
mine “what makes people artistically creative.” The 
SVIB and the drawings from the WFPT were given 
the titles of “Artistic Interest Questionnaire” and 
“Artistic Aptitude Test IV: Spatial Organization.” 
To add to the credibility of the situation, color re- 
productions of four paintings (Rouault, “The Old 
King”; Modigliani, “Portrait of a Girl”; Jawlensky, 
“Physiognomie”; Klee, “Magic Fish”) were mounted 
on the blackboard with instructions to name the 
painters and to rate the paintings in order of pref- 
erence prior to taking the tests. 

In each condition, the WFPT was given before the 
SVIB, which was followed by a postexperimental
inquiry and, finally, an explanation of the experi- 
ment. The Ss were clearly instructed to describe 
themselves; they were not asked to respond as if 
they were officers or artists. In short, the standard 
test instructions were given. 


Measurement of Role Enactment 


The effectiveness of role enactment was assessed 
by determining Ss’ responses to items and scales 
whose role-relevant cue properties had been independently
identified. The assumption underlying this
method is that a role may be enacted by a variety 
of means, for example, gross skeletal movements, 
posture, styles of speech, verbal utterances, including 
the checking of test items in role-appropriate ways. 

Drawn from an introductory psychology class, 10 
judges were asked to rate each of the 400 figures of 
the WFPT and each of the 400 items of the SVIB 
on an eight-point scale for their descriptiveness of 
the likes and dislikes of military officers. The means 
and standard deviations of the 10 ratings were com- 
puted for each of the 800 items. The reliability of 
the average ratings was assessed through a pro- 
cedure devised by Ebel (1951) and found to be .81 
(p < .01) and .84 (p < .01) for the WFPT and
SVIB, respectively. From the WFPT, the 40 items 
having received the highest mean ratings were des- 
ignated as being highly descriptive of the likes of 
military officers, the 40 items having received the 
lowest mean ratings as being highly descriptive of 
the dislikes of military officers. This procedure is 


2 I am indebted to Professor Praetorius for serving
as E in this condition. 
based on the assumption that role expectations (including
likes and dislikes) for a given social position
are held not only by incumbents of that position, 
but also by incumbents of other positions; in the 
case of familiar ones, they may be held by all mem- 
bers of a culture with varying degrees of accuracy. 
The WFPT score of role enactment thus consisted of 
the number of “like” responses to the 40 high-rated 
items plus the number of “dislike” responses to the 
40 low-rated items. Both the liking of high-rated 
items and the disliking of low-rated items reflect the 
commonly held preferences of military officers;
hence, the higher S’s score, the greater his effective- 
ness of enacting the officer role in terms of these 
items, and, conversely, the lower his score the greater 
his effectiveness of enacting the artist role. It is 
assumed that the preferences of military officers and 
artists are held to be largely opposed. This assump- 
tion is supported by inspection of the high- and 
low-rated items. 
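
The item selection and scoring just described can be sketched in code. This is a minimal illustration under stated assumptions, not the original procedure's implementation: the function names (`select_cue_items`, `role_enactment_score`), the item identifiers, and the toy ratings and responses are all hypothetical, and the toy set keeps 2 extreme items per end where the study kept 40.

```python
from statistics import mean


def select_cue_items(ratings, n_extreme):
    """Return (high_items, low_items): the n_extreme items with the highest
    and the lowest mean judge ratings.

    ratings maps an item id to the list of ratings given by the judges.
    """
    ranked = sorted(ratings, key=lambda item: mean(ratings[item]), reverse=True)
    return set(ranked[:n_extreme]), set(ranked[-n_extreme:])


def role_enactment_score(responses, high_items, low_items):
    """Count "like" responses to high-rated items plus "dislike" responses
    to low-rated items. Higher scores indicate officer-role enactment."""
    likes_high = sum(1 for item in high_items if responses.get(item) == "like")
    dislikes_low = sum(1 for item in low_items if responses.get(item) == "dislike")
    return likes_high + dislikes_low


# Hypothetical toy data: 6 figures rated by 3 judges on an eight-point scale.
toy_ratings = {
    "f1": [8, 7, 8], "f2": [6, 7, 6], "f3": [4, 5, 4],
    "f4": [4, 3, 4], "f5": [2, 2, 3], "f6": [1, 1, 2],
}
high, low = select_cue_items(toy_ratings, n_extreme=2)

# Hypothetical responses of one S.
toy_responses = {"f1": "like", "f2": "dislike", "f5": "dislike", "f6": "like"}
score = role_enactment_score(toy_responses, high, low)
print(sorted(high), sorted(low), score)
```

Under this scoring a single number serves both roles: a high score reflects officer-role enactment and a low score artist-role enactment, which presupposes that the two sets of preferences are largely opposed.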

From the SVIB, 20 items each from those falling 
in the top and bottom 10% of the frequency dis- 
tribution of mean ratings were selected in an effort 
to obtain (a) 20 items judged to be liked by mili- 
tary officers and disliked by artists and (b) 20 items 
judged to be disliked by military officers and liked 
by artists. These items were defined as possessing 
highly salient or “primary” cue properties for the 
officer and artist roles. The SVIB score of effective 
role enactment thus consisted of the number of like 
(or indifferent) responses to the 20 high-rated items 
plus the number of dislike (or indifferent) responses 
to the 20 low-rated items. Again, the higher S’s 
score the greater his effectiveness of enacting the 
officer role in terms of SVIB items, and, conversely, 
the lower his score the greater his effectiveness of 
enacting the artist role. 

The third measure of effective role enactment was 
the set of occupational scales of the SVIB, includ- 
ing the M-F scale. These scales were divided into 
three categories containing, respectively, scales de- 
fined as having “primary,” “secondary,” and “neu- 
tral” cue properties to permit testing of the effect 
of differential cue properties of test items upon test- 
taking behavior (see Tables 1-3). 

A scale was defined as possessing primary cue
properties if the content of the scale could be said 
to reflect clearly the role expectations for officer and 
artist. It was assumed, for example, that artists like 
the occupations Artist and Musician, as well as the 
likes and dislikes associated with these occupations, 
and that military officers dislike these occupations 
and their associated preferences. 

A scale was defined as having secondary cue prop- 
erties if its content merely correlated with the role 
expectations for officer and artist. The assignment of 
scales to the secondary category was made on the basis
of Strong’s (1951) factor analytic isolation of 11
independent groups of occupations prior to the collection
of data. The pattern of intercorrelations for
the present sample, however, did not differ appreciably
from Strong’s.

If a difference between the experimental groups is 
predicted for one of the primary scales within a
given occupational grouping, differences for the other 
scales within that grouping may be expected by 
virtue of the intercorrelations of the scales. While 
the predicted direction of the difference between ex- 
perimental groups is thus the same for both pri- 
mary and secondary scales, the magnitude of the 
difference is not expected to be the same, because 
the secondary scales offer less salient cues for role 
enactment. All nonprimary scales found in the same 
grouping as one of the primary scales were there- 
fore assigned to the secondary category. In addi- 
tion, the scales in Group VIII (business detail) were 
included in the secondary category on the basis of 
independent findings that Group VIII reliably dif- 
ferentiates artists (MacKinnon, 1962) and officers 
(Gough, 1958). 

All remaining scales were defined as having neu- 
tral cue properties, or as failing to offer cues rele- 
vant for the enactment of the two roles. Differences 
between experimental groups were not expected for 
these scales. 


Postexperimental Inquiry 


The effects of the situation on test-taking behav- 
ior, specifically the effects of S’s perception of role 
demands, may be assessed not only by means of 
inferences from S’s test-taking behavior but also 
from S’s report of his perceptions. It is necessary, 
however, to interpret such data cautiously since 
postexperimental inquiries do not furnish pure responses
to experimental manipulations, but may be
contaminated by S’s intervening experimental
responses.

Since an individual postexperimental inquiry was 
not feasible, Ss were asked to report their views of 
the testing situation in written, free response form. 
The resulting protocols were rated by two inde- 
pendent raters on a four-point scale designed to 
reflect whether or not S had become aware of E's
purpose or “misperceived” the role demands built
into the experiment. The points on the scale were
defined as 1, aware, for example, S states explicitly
that the experiment had nothing to do with assessing
military officers or artistic creativity; 2, somewhat
aware, for example, S voices doubts about purpose
of experiment as reflected in at least some of the
categories of the inquiry; 3, somewhat unaware, for
example, S does not appear to have discovered E's
purpose, but also fails to mention ostensible purpose;
4, unaware, for example, S explicitly accepts
ostensible purpose. Interrater reliability was .83 for
96 protocols (four Ss failed to complete the inquiry).
Eight military and eight artistic Ss received a rating
of 1 or 2; they were labeled “inaccurate role
perceivers.”


RESULTS 
Effects of Role Perception and Role Demands 


Hypothesis 1, stating that test-taking behavior
varies directly with the social position
occupied by S in the testing situation, was
tested by comparing the responses of Ss in 
the military and artistic conditions on the 
three measures of role enactment. Since the 
direction of the mean differences was predicted,
one-tailed t tests were used, except in
the analyses involving scales with neutral cue 
properties, where no differences were pre- 
dicted. It is to be noted that the direction of 
the predicted differences varies with the con- 
tent of the scales involved. 
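
For readers who wish to see the form of such a comparison, a pooled-variance t statistic can be computed from group summary statistics. The sketch below is illustrative only and is not the original analysis: the function name `pooled_t` is hypothetical, the means and group sizes are the WFPT values for all Ss given in Table 1, and the standard deviations (12.0 in each group) are invented placeholders, since the corresponding values are not available in this text.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-groups t statistic with a pooled variance estimate.

    Returns (t, df). A positive t means group 1 has the larger mean, so for
    a one-tailed test the sign must agree with the predicted direction.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Means and Ns from Table 1 (WFPT, all Ss); the SDs are assumed placeholders.
t_value, df = pooled_t(55.80, 12.0, 45, 46.94, 12.0, 55)
print(round(t_value, 2), df)
```

Because the standard deviations are assumed, the resulting t is not expected to reproduce the tabled value; the sketch only shows how the direction of the difference and the degrees of freedom enter a one-tailed comparison.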

The results (Tables 1-3) are given sepa- 
rately for the three levels of the cue-property 
variable to facilitate the presentation of later 
analyses. Each table presents two analyses: 
(a) comparison of experimental groups con- 
taining all Ss who participated and (b) repli- 
cation of the comparison after elimination of 
those Ss who were identified as inaccurate 
role perceivers by the postexperimental in- 
quiry. 

The results of the analysis involving mea- 
sures with primary cue properties (Table 1) 
clearly permit the rejection of the null hy- 
pothesis. The two experimental groups performed
differently on all 12 scales with high
reliability; moreover, their performances dif- 
fered in the expected directions without ex- 
ception. It is clear that the military Ss en- 
dorsed a greater number of items relevant to 
the officer role and that the artistic Ss en- 
dorsed a greater number of items relevant to 
the artist role, as predicted by Hypothesis 1. 
The alternative and more traditional hypoth- 
esis, that responses to personality tests are 
determined primarily by the personal char- 
acteristics of the testee and are therefore in- 
dependent of the conditions of measurement, 
appears to be incompatible with the present 
results. 

Table 2 presents the results for the role- 
enactment measures having secondary cue 
properties. Here, the null hypothesis may be 
rejected in 15 out of 21 comparisons when 
the total number of Ss is considered and in 11 
out of 21 comparisons when inaccurate role 
perceivers are excluded from the analysis. 
There were no reversals in the predicted di- 
rections of performance. Confronted with 
items having only secondary cue properties, 


TABLE 1
COMPARISON OF MILITARY AND ARTISTIC GROUPS ON ROLE-ENACTMENT MEASURES WITH
PRIMARY CUE PROPERTIES

                          All subjects                       Accurate subjects only
                  Military       Artistic              Military       Artistic
                   N=45           N=55                  N=37           N=47
Measure            Mean    PD a   Mean     t b          Mean    PD a   Mean     t b
WFPT               55.80    >     46.94    3.72[?]      59.16    >     46.82    5.01[?]
SVIB items         33.51          27.13    [illegible]  33.78          27.28    8.23[?]
Artist             21.87    <     29.63    3.32****     20.45    <     29.61    3.58[?]
Architect          22.60          29.44    2.82         21.59          29.00    2.80***
Musician           30.86          38.16    3.40*****    30.81          [illegible]  [illegible]
Advertising        30.71          36.85    3.23****     29.51          [illegible]  [illegible]
Author             28.84          34.78    3.10****     27.56          35.08    [illegible]
Aviator            41.38    >     36.25    2.02[?]      [illegible]
Army officer       38.60          32.54    [illegible]  [illegible]
[illegible]        [illegible]                          51.21    >     43.10    3.90[?]

a Predicted direction of mean difference between groups.
b One-tailed test.
* p < .025.
TABLE 2
COMPARISON OF MILITARY AND ARTISTIC GROUPS ON ROLE-ENACTMENT MEASURES WITH
SECONDARY CUE PROPERTIES

                                      All subjects                       Accurate subjects only
                              Military       Artistic              Military       Artistic
                               N=45           N=55                  N=37           N=47
Measure                        Mean    PD a   Mean     t b          Mean    PD a   Mean     t b
Psychologist                   29.31    <     32.89    [illegible]  29.37    <     32.40    1.31
Physician                      32.22          34.16    <1           32.43          33.82    <1
Psychiatrist                   32.15          34.16    <1           32.54          33.59    <1
Osteopath                      31.84          31.14    <1           32.45          31.31    <1
Dentist                        23.64          24.83    <1           23.56          24.68    <1
Music teacher                  25.37          30.18    2.11[?]      26.27          29.85    1.49
Lawyer                         32.28          36.41    1.93*        30.75          36.59    2.53[?]
Production manager             37.15    >     32.38    2.28**       37.21    >     32.36    2.13[?]
Farmer                         35.37          30.45    2.32**       36.59          30.17    2.97[?]
Carpenter                      23.08          18.40    1.73*        24.35          18.04    2.17**
Printer                        34.11          32.58    <1           35.35          32.04    1.61
Math-Science teacher           34.35          30.49    1.68*        36.37          30.10    2.49[?]
Sr. CPA                        41.24          35.07    3.18****     42.97          34.82    3.06[?]
Jr. accountant                 30.15          24.70    2.51[?]      31.29          25.02    2.67[?]
Office worker                  32.13          28.52    1.68*        33.35          29.08    1.82*
Credit manager                 38.44          33.47    2.10**       40.18          33.08    2.80****
Purchasing agent               30.44    >     26.50    1.69*        30.45    >     27.10    1.36
Business education teacher     34.37          30.69    1.71*        36.00          30.42    2.42
Banker                         25.00          21.01    1.92*        25.48          21.97    1.59
Mortician                      28.55          27.09    <1           28.75          27.36    <1
Pharmacist                     30.55          27.54    2.18**       30.86          27.89    1.91*

a Predicted direction of mean difference between groups.
b One-tailed test.
* p < .05.
** p < .025.
*** p < .01.
**** p < .005.
***** p < .0005.


Ss continued to endorse items relevant to 
their respective roles, although now they had 
less opportunity to do so. There is clearly an 
interaction effect between the role perception, 
role demand, and cue-property variables. An 
analysis of this effect is presented separately 
below. 

Table 3 presents the results for the role- 
enactment measures having neutral cue prop- 
erties. The results indicate only one signifi- 
cant difference out of 20 comparisons when 
the total number of Ss was considered, and 
only three departures from zero when “in- 
accurate perceivers” were excluded from the 
analysis. The trend in these results is congru- 
ent with the prediction that differential role 


enactment becomes impossible in the absence 
of role-relevant cues. 


Effect of Accuracy of Role Perception


Hypothesis 2, stating that test-taking be- 
havior varies directly with the accuracy of 
S’s perception of role demands, was tested by 
contrasting the performance of the two ex- 
perimental groups before and after elimina- 
tion of inaccurate role perceivers. If test- 
taking behavior varies with accuracy of role 
perception, the differences between the groups 
should increase after inaccurate perceivers 
are eliminated from the analysis. 

Accordingly, mean differences were com- 
puted for (a) the comparisons shown in Ta- 
TABLE 3
COMPARISON OF MILITARY AND ARTISTIC GROUPS ON ROLE-ENACTMENT MEASURES WITH
NEUTRAL CUE PROPERTIES

                                  All subjects                  Accurate subjects only
                            Military   Artistic              Military   Artistic
                             N=45       N=55                  N=37       N=47
Measure                      Mean       Mean      t a         Mean       Mean      t a
Veterinarian                 24.97      21.07     2.96***     25.65      20.89     3.27[?]
Mathematician                20.46      22.40     <1          19.86      22.23     <1
Physicist                    23.44      24.00     <1          23.51      23.61     <1
Engineer                     31.95      30.30     <1          31.48      29.91     <1
Chemist                      31.93      31.81     <1          31.64      31.42     <1
YMCA physical director       29.93      27.03     1.25        31.56      26.70     1.89
Personnel director           34.84      33.34     <1          35.64      32.80     1.11
Public administrator         41.00      38.07     1.50        42.00      37.63     1.98*
Vocational counselor         33.75      32.00     <1          35.29      31.46     1.64
YMCA secretary               23.02      22.69     <1          24.48      22.57     <1
Social science teacher       29.95      29.65     <1          31.62      29.63     <1
Physical therapist           37.82      33.45     1.71        39.91      32.72     2.61**
School superintendent        21.22      21.56     <1          21.59      21.51     <1
Social worker                30.28      32.12     <1          31.29      31.61     <1
Minister                     18.06      21.58     1.50        18.72      21.04     <1
CPA partner                  27.57      28.56     <1          26.81      28.89     1.08
Sales manager                31.55      30.49     <1          31.05      30.57     <1
Real estate sales            36.31      36.72     <1          35.72      37.14     <1
Life insurance sales         29.28      31.00     <1          28.64      31.23     1.13
President, manufacturing     29.93      30.60     <1          28.52      30.87     <1

Note. No directional predictions of mean differences were made for these scales.
a Two-tailed test.



bles 1, 2, and 3 involving all Ss (Difference 1) 
and (b) the comparisons in the same tables 
involving accurate Ss only (Difference 2), 
with the prediction that Difference 2 > Dif- 
ference 1. The nonindependence of most of 
the SVIB scales precluded the statistical comparison
of the entire array of increases in
group differences. To obtain some estimate of 
reliability, one scale was randomly selected 
from each of nine factorially isolated group- 
ings of the SVIB; in addition, the independ- 
ent WFPT score was included. Results of a 
sign test (Siegel, 1956) indicate reliably
larger differences between groups in the predicted
direction when inaccurate perceivers
are eliminated from the comparisons (v = 1,
p < .01). These results must be interpreted
cautiously, since the independence of the
SVIB factorial clusters is incomplete; they 


are supported, however, by inspection of the 
number (38 out of 53) and magnitude of the 
increases in mean differences shown in Tables 
1-3. It may therefore be concluded tentatively 
that accuracy in the perception of role de- 
mands leads to more effective role enactment 
in terms of test items, as predicted by Hy- 
pothesis 2. 
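
The logic of the sign test used here can be sketched as follows. This is a hedged illustration rather than a reconstruction of the reported analysis: the function name `sign_test_one_tailed` and the paired difference values are hypothetical, and the exact binomial form shown is one standard way of carrying out a one-tailed sign test.

```python
from math import comb


def sign_test_one_tailed(diff1, diff2):
    """One-tailed exact sign test of the prediction diff2 > diff1.

    Returns (against, n, p): the number of pairs contradicting the
    prediction, the number of untied pairs, and the probability of that
    many or fewer contradictions when each direction is equally likely.
    """
    pairs = [(a, b) for a, b in zip(diff1, diff2) if a != b]  # ties are dropped
    n = len(pairs)
    against = sum(1 for a, b in pairs if b < a)
    p = sum(comb(n, k) for k in range(against + 1)) / 2 ** n
    return against, n, p


# Hypothetical group differences for 8 measures:
# all Ss (Difference 1) and accurate Ss only (Difference 2).
difference_1 = [8.9, 6.4, 7.8, 4.1, 4.9, 3.9, 1.7, 0.6]
difference_2 = [12.3, 6.5, 9.2, 5.8, 6.4, 3.5, 2.1, 1.4]
against, n, p = sign_test_one_tailed(difference_1, difference_2)
print(against, n, p)
```

The test uses only the direction of each change, not its size, which is why it is supplemented in the text by inspection of the number and magnitude of the increases.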


Effect of the Cue Properties of the Test


Hypothesis 3, stating that test-taking be- 
havior varies directly with the role relevancy 
of test items, was examined by comparing Ss’
responses to scales differing in the saliency of 
their cue properties. To this end, the mean 
difference scores obtained from comparing 
military and artistic Ss on the SVIB scales 
may be used. If the cue properties of the test 
limit or enhance the possibilities for effective 
TABLE 4
MEANS AND STANDARD DEVIATIONS OF MEAN DIFFERENCE SCORES OBTAINED FROM COMPARING
MILITARY AND ARTISTIC GROUPS ON SVIB SCALES AT THREE LEVELS OF CUE PROPERTY

Scales        No. of scales     Mean     SD
Primary            10           6.50     0.79
Secondary          21           3.54     1.47
Neutral            20           1.59     1.16


role enactment, the difference scores may be 
expected to increase as responses to neutral, 
secondary, and primary items are considered 
in turn.

The means and standard deviations of the 
difference scores for the three levels of the 
cue-property variable are presented in Table 4.


It is clear from inspecting Table 4 that the 
mean differences between the experimental 
groups vary, in the predicted order, with the 
level of the cue properties of the SVIB scales.
Comparisons of the individual mean difference
scores by chi-square tests yielded a value
of 11.13 (df = 1, p < .0005) for the difference
between primary and secondary scales,
a value of 12.15 (df = 1, p < .0005) for the
difference between primary and neutral scales,
and a value of 8.79 (df = 1, p < .005) for
the difference between secondary and neutral 
scales. It should be noted that the results of 
the chi-square tests are only approximate, for 
the reasons given earlier, but again they are 
clearly supported by the data in Tables 1-3 
which include the data shown in Table 4. 

Therefore, it may be concluded tentatively 
that effectiveness of role enactment varies sys- 
tematically with the presence of role-relevant 
cues in the test. The more appropriate the 
cues, the greater the effectiveness of role en- 
actment. 
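
The ordering of difference scores across cue levels can be illustrated with a small computation. The sketch below rests on assumptions that should be kept in mind: it uses only three legible rows per cue level from the all-subjects columns of Tables 1-3 (Artist, Musician, Author; Lawyer, Farmer, Banker; Physicist, Chemist, Minister), it treats the difference score as the absolute difference between group means, and the function name `summarize` is hypothetical. Because it covers a subset of scales, it is not expected to reproduce the values in Table 4.

```python
from statistics import mean, stdev

# (military mean, artistic mean) for a subset of SVIB scales, all Ss.
rows = {
    "primary": [(21.87, 29.63), (30.86, 38.16), (28.84, 34.78)],
    "secondary": [(32.28, 36.41), (35.37, 30.45), (25.00, 21.01)],
    "neutral": [(23.44, 24.00), (31.93, 31.81), (18.06, 21.58)],
}


def summarize(pairs):
    """Mean and standard deviation of absolute between-group differences."""
    diffs = [abs(military - artistic) for military, artistic in pairs]
    return mean(diffs), stdev(diffs)


summary = {level: summarize(pairs) for level, pairs in rows.items()}
for level, (m, sd) in summary.items():
    print(f"{level:<10} mean difference = {m:.2f}, SD = {sd:.2f}")
```

Even in this small subset the mean difference falls from the primary to the secondary to the neutral scales, which is the pattern the hypothesis predicts.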


DISCUSSION


The present findings revealed quite clearly 
that individuals adapt their test responses to
situational requirements, even though they are 
in no way instructed to produce responses 
other than self-descriptions. Highly reliable 
group differences in predicted directions 
emerged from exposing Ss to different sets of
implicit social cues. To the extent that the 
two groups perceived the testing as being 
without consequences for their personal fu- 
tures—both worked under research conditions 
and were told that individual results would 
remain unknown—it may be concluded that 
the observed differences occurred in the ab- 
sence of differential motivational factors. Cer- 
tainly motivational differences were at a mini- 
mum compared to those found, for example, in 
studies which contrast research with employ- 
ment conditions. 

Since the experimental manipulations were 
minimal and approximated those found in or- 
dinary testing situations, the results are read- 
ily generalized, except for limitations imposed 
by the method of recruitment, the kind of Ss, 
and instruments used. The results were repli- 
cated from an unstructured test over a highly 
structured one, but the domain of personality 
tests clearly needs to be sampled more ade- 
quately before the conclusions can be ex- 
tended with assurance to the range of person- 
ality tests in use today. Since the methods of 
recruitment yielded differential rates of vol- 
unteering, the possibility is not ruled out that 
Ss with special interests in artistic matters 
were drawn into the artistic condition and that 
the group differences were thus somewhat in- 
flated. For practical reasons, it was impossible 
to equate the methods of recruitment. An- 
other limitation lies in the deliberate con- 
founding of the independent variables. The 
test titles, the ostensible purpose of testing, 
the setting, and the examiner were concep- 
tualized as operationally representing the con- 
struct of role demands. While such a proce- 
dure seems justifiable in an initial study, it is 
clearly desirable in future investigations to 
study the effect of each variable separately. 
At present, it would be idle to speculate which 
factor exerted the greatest influence or whether 
the several factors need to be present simultaneously
to produce the observed effect.3

The present results would not seem to be 
readily subsumable under other current con- 
ceptions of test-taking behavior. What may be 


3 A follow-up study, conducted with Thaia Roberts,
failed to reveal the role-taking effect when only
the test titles were varied. 
identified as the traditional view insists that
the test response is primarily a function of 
S’s enduring characteristics which are not 
easily altered by external conditions; yet two 
different but benign experimental manipula- 
tions seem to be capable of inducing marked 
response differences. The social desirability 
hypothesis (Edwards, 1957) asserts that the 
testee will strive for a “good impression,” but 
it does not predict that the impression will 
be role specific; rather, it predicts the same 
response tendency irrespective of the nature 
of the testing situation. In brief, while Ed- 
wards’ hypothesis calls attention to the influ- 
ence of social norms stated at the level of 
culture, the role-taking hypothesis calls at- 
tention to the influence of such norms at the 
more specific level of social position and is, 
therefore, able to make more specific predic- 
tions of the direction of response. Finally, the 
present results cannot be subsumed under the 
acquiescence hypothesis (Messick & Jackson, 
1961). Role enactment in terms of test items 
appears to involve more than acquiescence to 
positively stated items irrespective of their 
content; rather, it seems to require selective 
attention to the role-relevant cue properties of 
test items which then function to guide S’s 
responses in role-specific directions. It should 
be clear, however, that the present findings do 
not bear upon the tenability of the acquies- 
cence hypothesis as such. 

On the other hand, the present approach 
closely resembles the positions of clinically 
oriented writers (e.g., Rotter, 1960) who also 
emphasize the influence of the situation and 
the need to consider such influences systemati- 
cally. The chief difference between such writ- 
ers and the present approach lies perhaps in 
the avoidance of an undue emphasis upon 
motivational principles. The difficulties aris- 
ing from the discrepancy of the testee’s and 
the examiner’s goals have long been recog- 
nized (Rotter, 1960); thus, test users are ad- 
monished to establish rapport with their Ss 
prior to testing. That more subtle cognitive
factors also function to determine test responses
has received far less consideration.
The present findings seem to show, however, 
that marked motivational differences need not 
be present for response changes to occur. 
Implications


One of the implications of the present find- 
ings concerns the interpretation of test scores. 
Campbell and Fiske (1959) advanced the use- 
ful notion of the test score as a trait-method 
unit where both the trait postulated and the 
method used are regarded as sources of vari- 
ation. The present results, as well as much 
other evidence, point to the testing situation 
as a further source of variation. It may be 
heuristic, therefore, to regard the test score as 
a trait-method-role unit. This proposition 
would seem to recognize the full complexities 
of the problem of interpretation without giv- 
ing undue emphasis to a single point of view. 
It calls attention to variation arising from the
use of different methods for measuring the 
same trait, as well as to variations arising 
from measuring the same trait in different 
situations without denying the contribution 
of personal consistencies to the test protocol. 
To deny the latter would be to deny the real, 
if limited, predictive successes achieved by 
such instruments as the SVIB, MMPI, and
CPI.

A second implication, closely related to the 
first, concerns the prediction of behavior from 
self-report measures. One of the assumptions 
underlying the role theoretical approach is 
that behavior is a joint function of personal 
and situational factors (Sarbin, 1964). If this 
assumption is reasonable, one would not ex- 
pect to obtain high validity coefficients be- 
tween test scores gathered in one situation and 
criterion behavior observed in another. In
the language of the present approach, high 
validity coefficients may be expected only 
when the testee occupies the same social posi- 
tion in both the testing and criterion situa- 
tions. 

Last, the present findings contribute to the 
rapidly emerging area of inquiry known as 
the social psychology of the psychological experiment
(e.g., Orne, 1962). There is increasing
evidence that S responds not only to
the independent variables created by E, but 
also to the demand characteristics of the ex- 
perimental situation (Orne & Scheibe, 1964). 
It is entirely possible to describe the behavior
of the testee in the same terms. His behavior,
too, is not only a function of his personality
characteristics but also a response to the
requirements of the total testing situation. 
Social role theory appears to have considera- 
ble heuristic utility in analyzing these re- 
quirements and for predicting their conse- 
quences for test-taking behavior. 

SOME CONCOMITANTS OF THERAPIST DOMINANCE 
IN THE PSYCHOTHERAPY INTERVIEW 


GEORGE V. C. PARKER¹ 
University of Texas 


Directive (Dir) and nondirective (NDir) verbal behavior of 16 therapists dur- 
ing initial psychotherapy interviews were studied in relation to therapist 
dominance (Dom). It was hypothesized that, irrespective of client sex, thera- 
pist Dom would be related (a) positively to Dir therapist verbalizations, and 
(b) negatively to NDir therapist verbalizations. Data from 32 psychotherapy 
sessions supported both predictions. Moreover, Dir and NDir verbalizations 
were found to be reliably related to client sex, with therapists giving higher 
proportions of NDir responses to female clients than to male clients, and 
higher proportions of NDir than Dir responses to female clients. Implications 
for selection and training of therapists, predictions of therapy progress from 
therapist personality dimensions and verbal behavior, and further research 
were discussed. 


Historically, in the conceptualization of 
psychotherapy research, major attention has 
been paid to the role and attributes of the 
patient. Perhaps spurred by the widespread 
interest in the implications of findings of 
verbal conditioning studies, and their rele- 
vance for understanding the psychotherapy 
process (cf. Krasner, 1958, 1962), recent 
years have witnessed increasing concern with 
the therapist’s contribution to the treatment 
process. However, considering the obvious im- 
portance of therapist variables for our under- 
standing of psychotherapy, it is paradoxical 
how few studies thus far have focused sys- 
tematically on investigation of the relation- 
ships between what the therapist does and 
how these activities are related to his person- 
ality. Among others, Frank (1959) and 
Strupp (1960) have given explicit recogni- 
tion to the importance of research directed 
toward clarifying the relationships between 
personality dimensions of therapists and their 
verbal techniques. Frank (1959, p. 17) has 
noted this crucial deficit: 


The most important, and unfortunately the least 
understood, situational variable in psychotherapy is 
the therapist himself. His personality pervades any 
technique he may use, and . . . he may influence 
the patient through subtle cues of which he may not 
be aware. 

¹ The author wishes to express his appreciation to 
Leonard D. Goodstein of the University of Cincinnati 
for his many helpful suggestions concerning 
this study. 


Reliable relationships have been found be- 
tween classes of therapists’ verbal behavior 
and such factors as therapists’ theoretical and 
professional orientation, experience, and pres- 
ence or absence of personal analysis (Strupp, 
1955a, 1955b, 1955c, 1958). Aronson (1953) 
has reported significant correlations between 
peer ratings of personality characteristics of 
therapists (such as submissiveness and de- 
pendency) and judged directiveness and non- 
directiveness of the therapists’ verbalizations 
during therapy interviews. Others have stud- 
ied the impact of directive therapist responses 
upon subsequent client verbalizations (Carnes 
& Robinson, 1948; Frank, 1964; Frank & 
Sweetland, 1962). 

The present study was designed to extend 
this tradition into an examination of classes 
of therapists’ directive and nondirective 
verbal behavior during initial psychotherapy 
interviews as a function of the dominance 
dimension of therapists’ personality. Con- 
spicuous in Edwards’ (1959) and Gough and 
Heilbrun’s (1965) definition of dominance 
behaviors are the controlling and directing 
components which might be observed in any 
interpersonal situation, including the dyadic 
psychotherapy relationship, where most ob- 
servable behavior is verbal. It was hypothe- 
sized basically that, irrespective of client sex, 
therapist dominance would be (a) related 
positively to directive classes of therapist 
verbalizations, and (b) negatively related to 
nondirective classes of therapist verbal behavior. 
Ultimately, differences in therapists’ 
verbal behavior which may be found to be 
related to psychometrically defined therapist 
personality characteristics may become more 
predictable and, thus, significant in therapy 
management and prognosis. 


METHOD 


Data used in this study were all collected through 
the usual functions and policies of the University 
Counseling Service of the State University of Iowa 
and the University Counseling Center of the University 
of Texas with regard to clients who seek professional 
counseling primarily for personal problems. 


Therapist Sample 


Participating therapists were advanced male gradu- 
ate students who were counselors in training at the 
time the data were gathered. All 16 therapists, 
ranging in age from 25 to 44, had similar prior 
counseling experience and academic backgrounds, all 
had taken one or more graduate courses in psycho- 
logical counseling, and none had more than 2 years 
of counseling experience at the time the study was 
initiated. 


Therapist Dominance 


Therapists were given the Gough Adjective Check 
List (ACL) within 1 month of the initiation of the 
study. All ACLs were scored for the Heilbrun Need 
Scales (Gough & Heilbrun, 1965), providing rela- 
tively independent scores on 15 personality traits, 
including Dominance (Dom). The therapists’ Dom 
Scale T score distribution ranged from 42 to 76, 
similar to previously reported data of this kind 
(Heilbrun, 1961a). Division of the distribution at 
the median yielded a mean T score of 50.1 for the 
low (Lo) Dom therapist group (N = 8) and 67.3 
for the high (Hi) Dom therapist group (N = 8). 


Clients 


The clients (N =32), both students and non- 
students, had originally been seen by a professional 
staff member of the respective agency for an intake 
interview, at which time a mutual decision had 
been reached that the client would be seen further 
for personal adjustment counseling. Cases used in this 
study were then assigned to one of the 16 graduate- 
student therapists on the basis of case-load con- 
siderations. One male and one female case were 
randomly selected from each therapist’s load, making 
a total of 32 cases for purposes of the present study. 


Therapy Protocols 


Protocols were obtained from electronically 
recorded tapes of 32 initial counseling sessions. Tapes 
were converted into verbatim transcripts, altered only 
to maintain confidentiality. In all other respects the 
transcripts preserved the verbal content of the 
therapy sessions intact. 

The focus of this study on data from transcripts 
of the initial phase of therapy resulted from the 
commonly held position that the initial contact be- 
tween therapist and client is, in many ways, the 
crucial interaction of therapy. As Polansky and 
Kounin (1956) point out, if the initial interview 
is unsuccessful, there may well be no further contact 
between therapist and client. The financial loss with 
premature terminators is another serious problem 
which has stimulated efforts to identify these clients 
psychometrically (cf. Heilbrun, 1961a, 1961b). 


Judges 


The judges for this study were 32 graduate students 
enrolled in graduate courses in psychotherapy. 
After familiarization with the content-categorization 
system, each judge was assigned to five different 
transcripts on a basis that assured for each judge 
at least two of the five transcripts were from male 
or female clients, and at least two of the five transcripts 
were from HiDom or LoDom therapists. 
Moreover, no two judges were assigned more than 
two transcripts in common. On this basis, every 
protocol was analyzed by five different judges. 


Content Analysis of Therapists’ Verbal 
Behavior 


Content analysis of the therapists’ verbal behavior 
followed a modified Snyder schema (1945, 1963), 
paralleling that described by Frank and Sweetland 
(1962), providing for the objective classification of 
each therapist statement into one of 20 categories. 
A satisfactory level of interjudge agreement has 
been reported for this kind of categorization system 
(Fogel, 1957). All responses classified identically by 
fewer than three-fifths of the judges were regarded 
as showing zero percentage of interjudge agreement 
and were not included in further analyses. Three- 
fifths, four-fifths, and five-fifths judge agreement 
were assessed as 60, 80, and 100% interjudge agree- 
ment, respectively. Averaging these percentages over 
responses for the 32 transcripts individually yielded 
average percentages of interjudge agreement ranging 
from 49.5 to 87.3, with a mean of 68.5. Analysis 
confirmed that average percentage of interjudge 
agreement was independent of client sex and 
therapist Dom. 
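
The agreement scoring described above can be illustrated with a short sketch. It assumes that each therapist response carries exactly five judge labels, that agreement is the share of judges choosing the modal category, and that responses with fewer than three concurring judges are scored as zero yet still counted in the transcript average (an inference from the reported minimum of 49.5, which lies below 60). The function names and the sample labels are hypothetical.

```python
from collections import Counter


def response_agreement(labels):
    """Percent agreement for one response rated by five judges.

    Returns 60, 80, or 100 when three, four, or five judges chose the
    same category, and 0 when fewer than three agreed.
    """
    if len(labels) != 5:
        raise ValueError("expected exactly five judge labels")
    top_count = Counter(labels).most_common(1)[0][1]
    return 100 * top_count // 5 if top_count >= 3 else 0


def transcript_agreement(responses):
    """Average agreement over all responses in one transcript.

    Assumption: zero-agreement responses stay in this average even
    though they are dropped from the later Dir/NDir analyses.
    """
    scores = [response_agreement(r) for r in responses]
    return sum(scores) / len(scores)


def retained_categories(responses):
    """Modal category of every response that reached 3/5 agreement."""
    kept = []
    for labels in responses:
        category, count = Counter(labels).most_common(1)[0]
        if count >= 3:
            kept.append(category)
    return kept


# Hypothetical transcript with four therapist responses.
example = [
    ["DQ", "DQ", "DQ", "DQ", "DQ"],  # 5/5 agree
    ["RC", "RC", "RC", "CF", "CF"],  # 3/5 agree
    ["M", "M", "M", "M", "SA"],      # 4/5 agree
    ["DQ", "FT", "IN", "RS", "DQ"],  # only 2/5 agree, scored 0
]
print(transcript_agreement(example))  # 60.0
print(retained_categories(example))   # ['DQ', 'RC', 'M']
```

Under these assumptions a transcript can average below 60 only if some of its responses failed to reach three-fifths agreement, which is consistent with the range reported.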


Directive and Nondirective Response Classes 


Much as in other studies of this variable (Carnes 
& Robinson, 1948; Finesinger, 1948; Frank, 1964), 
directive (Dir) therapist responses were defined as 
those which would tend clearly to lead, direct, or 
control the verbal activity during the therapy session. 
Classes of Dir responses were asking direct questions 
(DQ), approval and encouragement (AE), giving 
information (IN), forcing the topic (FT), reas- 
surance (RS), and persuasion (PS). Nondirective 
(NDir) therapist responses were defined as those 
which would tend to give responsibility of decision 
for choice of area and direction of verbal activity 
largely to the client as well as those responses which 
reflect or clarify the client’s affect. Classes of NDir 
responses were mm-hmm (M), simple acceptance 
(SA), maintaining initiative for the discussion with 
the client (FN), the traditional nondirective lead 
(ND), restatement of all or part of what the client 
has just said (RC), and clarification of feeling (CF). 


TABLE 1 
MEANS AND STANDARD DEVIATIONS OF PROPORTIONS OF THERAPIST RESPONSES 

                         Level of therapist dominance 
                   HiDom therapists            LoDom therapists 
                   Male     Female             Male     Female 
Response class     clients  clients  X̄         clients  clients  X̄ 
Dir 
  DQ               24.3ᵃ    25.5     24.9      13.7     10.8     12.3 
  IN                3.3      2.4      2.8        .4       .8       .6 
  AE                 .7       .4       .5       0.0      0.0      0.0 
  FT                5.9      6.2      6.1       8.0      3.1      5.6 
  RS                6.8      1.4      4.1        .9      1.0       .9 
  PS                 .8       .4       .6        .2      0.0       .1 
  Σ                41.8ᵇ    36.1     39.0      23.2     15.7     19.5 
  SD               21.0ᶜ    18.6     19.4      10.8     10.9     11.2 
NDir 
  M                 7.7      4.7      6.2      15.5     23.2     19.3 
  SA                2.7      2.0      2.4       1.6      1.1      1.3 
  FN                2.3      4.0      3.1       4.4      2.1      3.2 
  ND                1.0       .2       .6       4.2      2.5      3.4 
  RC               13.1     17.0     15.1      16.4     16.9     16.7 
  CF               10.2     14.3     12.3      14.0     18.3     16.2 
  Σ                37.0     42.2     39.7      56.1     64.1     60.1 
  SD               14.4     14.8     14.3      21.8     18.2     19.7 
X̄                  39.4     39.2     39.3      39.7     39.9     39.8 
SD                 17.6     16.6     17.1      23.7     29.0     25.7 

ᵃ Values for subclasses of Dir and NDir responses are average proportions for the respective groups of eight therapists. 

ᵇ Totals for subclasses of Dir and NDir responses are equivalent to the mean proportions of Dir and NDir responses within 
the respective Hi- or LoDom therapist groups; that is, 41.8 is the average proportion of Dir responses given by HiDom therapists 
to male clients. To divide 41.8 by 6 would yield a value meaningless for present purposes. 

ᶜ All SDs were derived from the distributions of therapists’ overall proportions of Dir and NDir verbalizations; that is, 21.0 
is the SD for the distribution of proportions around the mean of 41.8 of Dir responses for eight HiDom therapists. 

Since there is no restriction on total number of 
therapist responses in studies utilizing data from actual 
therapy interviews, frequency data were converted 
into proportions for purposes of statistical analysis. 
The test of the relationship between therapist Dom, 
client sex, and proportions of Dir and NDir therapist 
responses was made with a Type VI analysis of 
variance (Lindquist, 1953). 
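
As a hedged illustration of the conversion to proportions, the sketch below turns one session's category counts into percentages of all therapist responses and sums them within the Dir and NDir classes listed above. The counts are invented, and the code assumes that the denominator is every retained response, including those falling in the remaining categories of the 20-category schema; the source does not spell this out.

```python
DIR_CLASSES = ("DQ", "AE", "IN", "FT", "RS", "PS")
NDIR_CLASSES = ("M", "SA", "FN", "ND", "RC", "CF")


def class_proportions(counts):
    """Convert raw category counts into percentages of all responses.

    `counts` maps category code -> frequency for one therapist-client
    session. Categories outside Dir and NDir stay in the denominator
    (an assumption), so the two percentages need not sum to 100.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses to convert")
    pct = {code: 100.0 * n / total for code, n in counts.items()}
    dir_pct = sum(pct.get(code, 0.0) for code in DIR_CLASSES)
    ndir_pct = sum(pct.get(code, 0.0) for code in NDIR_CLASSES)
    return dir_pct, ndir_pct


# Hypothetical session: 80 responses, 16 of them in other categories.
session = {"DQ": 20, "FT": 4, "RS": 4, "M": 8, "RC": 16, "CF": 12, "OTHER": 16}
dir_pct, ndir_pct = class_proportions(session)
print(round(dir_pct, 1), round(ndir_pct, 1))  # 35.0 45.0
```

Working with percentages rather than raw counts keeps a talkative therapist from dominating the group means simply by producing more responses.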


RESULTS 


Overall proportions of therapists’ Dir and 
NDir responses, from a total of 1,245 HiDom 
therapist responses and 1,621 LoDom therapist 
responses, are summarized in Table 1, 
from which it can be seen that several subcategories 
contain most of the Dir and 
NDir responses. There is, for example, very 
little AE or PS done by either Hi- or LoDom 
therapists during the initial therapy sessions. 
On the other hand, there is a considerable 
amount of DQ which occurs during this stage, 
regardless of therapist Dom, although rela- 
tively more is done by HiDom therapists 
than LoDom therapists. 

Results of analysis of variance of these 
data yielded three significant F ratios. First, 
all therapists gave a significantly greater pro- 
portion of NDir responses (49.9) than Dir 
responses (29.2) to all clients (F = 7.86, 
df = 1, p < .02). However, as inspection of 
Table 1 suggests, this main-effect difference 
was contingent upon a significant first-order 
interaction, that is, Therapist Dom × Dir-NDir 
Therapist Response Classes. 

In support of the hypothesis which generated 
this study, the analysis of the Therapist 
Dom × Dir-NDir interaction was statistically 
significant (F = 7.35, df = 1, p < .02). This 
finding indicates that the rates at which these 
classes of therapist verbal behavior occur 
during initial psychotherapy interviews are a 
function of level of therapist Dom. Closer 
inspection of these differences, summarized in 
Table 2, indicated that, irrespective of client 
sex, HiDom therapists gave a higher propor- 
tion of Dir responses (39.0) than did LoDom 
therapists (19.5); t = 3.48, df = 14, p < .01. 
Furthermore, HiDom therapists, regardless of 
client sex, gave a smaller proportion of NDir 
responses (39.7) than did LoDom therapists 
(60.1); t = 3.24, df = 14, p < .01. However, 
while the difference between proportions of 
Dir (39.0) and NDir (39.7) responses given 
by HiDom therapists was not significant 
(t = .20, df = 7, p > .05), the same comparison 
for LoDom therapists was highly significant: 
Dir responses (19.5) versus NDir responses 
(60.1); t = 14.96, df = 7, p < .01. 
Thus considerable confirmation for the hy- 
pothesis that Dir therapist responses are re- 
lated positively to psychometrically defined 
therapist Dom, while NDir therapist responses 
are negatively related to this therapist 
dimension, can be found in these data. 

A second significant interaction, which was 
between proportions of Dir-NDir responses 
and client sex (F = 5.86, df = 1, p< .05) 
was not consistent with the predicted null 
hypothesis. Instead, it indicated that thera- 
pists’ verbal behavior is significantly related 
to the sex of the client being interviewed. 
Closer inspection of these data, summarized 
in Table 3, indicated that while all therapists 
gave roughly equal proportions of Dir responses 
to both male (32.6) clients and 
female (25.9) clients (t = 1.90, df = 15, 
p > .05), they gave significantly more NDir 
responses to female (53.2) clients than to 
male (46.5) clients; t = 2.50, df = 15, p < .05. 
Similarly, while the difference between 
proportions of Dir (32.6) and NDir (46.5) 
responses given by therapists to male clients 
was not significant (t = 1.61, df = 15, p > .05), 
there was a significant tendency for 
therapists to give proportionally more NDir 
responses (53.2) than Dir responses (25.9) to 
female clients (t = 3.00, df = 15, p < .01). 
Although the Dom × Dir-NDir × Client Sex 
interaction was not statistically significant 
(F = .19, df = 1, p > .05), it can be seen in 
Table 1 that the primary contribution to the 
significant Client Sex × Dir-NDir interaction 
came from the LoDom therapists who gave 
comparatively few DQ (Dir) responses, as 
well as a relatively high proportion of M 
(NDir) responses, to female clients as contrasted 
with male clients. 


TABLE 2 
MEANS AND STANDARD DEVIATIONS OF PROPORTIONS 
OF THERAPIST DIRECTIVE AND NONDIRECTIVE 
RESPONSES AS A FUNCTION OF LEVELS 
OF THERAPIST DOMINANCE 

                        Therapist dominance 
Response class          High      Low 
Dir       M             39.0      19.5 
          SD            19.4      11.2 
NDir      M             39.7      60.1 
          SD            14.3      19.7 


TABLE 3 
MEANS AND STANDARD DEVIATIONS OF PROPORTIONS 
OF THERAPIST DIRECTIVE AND NONDIRECTIVE 
RESPONSES AS A FUNCTION OF CLIENT SEX 

                        Client sex 
Response class          Male      Female 
Dir       M             32.6      25.9 
          SD            18.8      18.1 
NDir      M             46.5      53.2 
          SD            20.4      19.6 

Statistically reliable differences were found 
for none of the other main effects or inter- 
action comparisons. 


DISCUSSION 


Results of this investigation, showing that 
differences in initial interview verbal behav- 
ior of therapists are related to both therapist 
personality differences and client sex, lead to 
several important implications. In addition to 
providing useful validational data for the 
ACL Dom Scale, the finding that HiDom 
therapists gave a significantly higher proportion 
of Dir responses and a significantly lower 
proportion of NDir responses than LoDom 
therapists suggests possible problem areas in 
the selection and training of therapists. A 
HiDom trainee, for example, who is likely to 
respond to clients with approximately twice 
as many direct questions as a LoDom 
trainee early in therapy, may find training 
difficult in a setting which stresses a non- 
directive orientation in the psychotherapy 
process. Likewise, the LoDom trainee, who 
may be more likely to let the client assume 
major responsibility for the direction of the 
therapy interview, may experience consider- 
able frustration while training in a more di- 
rectively oriented setting. On the basis of the 
present findings, such generalizations are, of 
course, highly speculative, but the results 
suggest that further systematic research 
could be of value for enhancing the product 
of psychotherapy training programs. 

Another implication of these data is re- 
lated to the previously reported finding that 
highly directive therapist responses, such as 
DQ and FT, which permit the client little 
freedom in responding, result in significantly 
fewer client verbalizations (Carnes & Robin- 
son, 1948) and fewer statements by clients 
which reflect understanding and insight into 
their problems (Frank & Sweetland, 1962). 
Since HiDom therapists’ interview behavior 
is characterized by more Dir responses, as 
compared to LoDom therapists, the serious 
possibility arises that there would be an inverse 
relationship between therapist Dom and 
incidence of psychotherapy success, particu- 
larly in settings where client statements re- 
flecting understanding and insight are re- 
garded as relevant and essential to successful 
treatment outcome. In any event, evaluation 
of such a relation must await more direct 
research evidence. 

That client sex is associated with differ- 
ences in therapist verbal behavior is difficult 
to account for and may involve any of several 
speculative explanations. The data suggest 
that inexperienced male therapists are likely 
to allow female clients, as contrasted with 
male clients, fuller responsibility for the direction 
of the initial interview. This is consistent 
with the widely held assumption that 
male clients often find it harder to adopt a 
client role and to discuss emotional problems 
than female clients (Heilbrun, 1961b). Particularly 
early in the therapy relationship, 
male clients may need to learn that discussion 
of their emotional problems, otherwise gener- 
ally inappropriate, is desired in the kind of 
therapy relationship sought by therapists par- 
ticipating in this study. Consequently, this 
may have led to comparatively less therapist 
probing or structuring with questions and 
more following (M) behavior with female 
than with male clients. 

The present study does not permit con- 
firmation or disconfirmation of these hypothe- 
sized explanations for the observed relation- 
ships between client sex and therapist inter- 
view behavior. Moreover, only an extension 
of this study would allow assessment of the 
degree to which these findings are generaliz- 
able to later stages of therapy or to more 
experienced therapist groups. Evidence has 
been reported, for example, by Grigg (1961) 
and Bohn (1964), which indicates that level 
of therapist experience is an important vari- 
able in determining therapists’ tendency to 
control or direct the therapy session. Grigg, 
who had clients rate their therapists at the 
termination of therapy, reported that the 
more experienced therapists were seen as less 
active in beginning the interviews, less prone 
to give advice and suggestions, and less con- 
trolling of the sessions than were inexperi- 
enced therapists. Obtaining his data from a 
psychotherapy experimental paradigm, Bohn 
found that the verbal behavior of relatively 
experienced therapists, that is, male graduate 
students who were in advanced stages of 
counseling training, was less directive than 
that of inexperienced “therapists,” that is, 
male college undergraduates. Experienced 
therapists made more use of RC and CF and 
less often employed Dir responses such as 
DQ, PS, FT, and RS. Findings such as these, 
when considered with the results of the pres- 
ent study, underscore the need for continued 
research into therapist interview differences 
as a function of several dimensions including 
personality dimensions, experience, and back- 
ground differences. 

Evidence that differential verbal behavior 
of therapists is related to both psycho- 
metrically definable therapist personality 
characteristics and to client sex offers substantial 
justification for the concern expressed 
by Frank (1959) and Strupp (1960). The 
present findings strongly suggest therapists’ 
techniques are related to personality factors 
as well as situational cues, such as client sex, 
which may lead therapists to respond to 
clients in ways that are poorly understood 
and, perhaps more important, ways that may 
be detrimental to the client. 


NOTES AND COMMENTS 


ORDINAL POSITION AND APPROVAL MOTIVATION 
GARY MORAN 


University of Indiana 


The Marlowe-Crowne Social Desirability scale, a measure of need for approval 
motivation, was administered in a group setting to a sample of university 
freshmen. The sample consisted of 80 first- and 110 latter-born males and 67 
first- and 92 latter-born females. Significant main effects for sex (p < .05) and 
birth order (p < .025) indicated that females as well as firstborn Ss showed 
higher need for approval. Separate analysis suggested that the birth-order finding 
was mostly attributable to females. 


Despite considerable attention, little is known 
concerning personality variables associated with 
ordinal positions (Sampson, 1965). Evidence 
does indicate that firstborns are disproportion- 
ately represented in diverse college populations 
(Schachter, 1963). Several studies indicate that 
firstborns have stronger affiliative tendencies 
when measured by TAT stories (Warren, 1966), 
although there is evidence that this finding is 
restricted to females (Dember, 1964). Other in- 
vestigations attempting to differentiate ordinal 
positions on the basis of Allport’s values, authoritarianism 
(Greenberg, Guerino, Lashen, Mayer, 
& Psikowski, 1963) or empathy defined as iden- 
tification with a model (Stotland & Walsh, 1963), 
have been unsuccessful. 

The present study attempted to relate need for 
approval to ordinal position. Very diverse be- 
havioral phenomena have been shown to char- 
acterize subjects high in need for approval 
(Crowne & Marlowe, 1964). In this respect, 
approval motivation represents a generalized behavioral 
syndrome. If first- and latter borns differ 
in need for approval, various inferences as to the 
personality dynamics underlying ordinal position 
effects could be drawn. 


METHOD 

The Marlowe-Crowne (M-C) Social Desirability 
Scale (Crowne & Marlowe, 1964), a measure of 
need for approval, was administered to a sample 
of introductory psychology students consisting of 
80 first- and 110 latter-born males and 67 first- 
and 92 latter-born females. Data for only children 
were excluded due to their scarcity in the sample. 
Protocols were administered at one session in 
a large auditorium. After completing the test, subjects 
indicated their ordinal positions by assigning a 
code to the protocols. 


RESULTS 


Mean need for approval scores are given in 
Table 1. Table 2 contains a summary of the 
analysis of variance for data presented in 
Table 1. Inspection of Table 2 indicates that 
main effects due both to sex (p< .05) and birth 
order (p < .025) were significant. In line with 
several other ordinal position experiments 
(Dember, 1964; Gerard & Rabbie, 1961; Samp- 
son, 1962; Schooler, 1964) the difference between 
first- and latter-born females is considerably 
greater than the corresponding difference for 
males. Based solely upon the analysis of vari- 
ance, the absence of a significant interaction 
term would not suggest restriction of generalization 
of the birth-order finding to females. Point-biserial 
correlations were also calculated between 
being firstborn and M-C scores for the total 
group (r_pbis = .13, p < .025), male subsample 
(r_pbis = .08, p = .28), and female subsample 
(r_pbis = .19, p < .025). Although the subsample 
correlations do not differ significantly (p = .33), 
which fact was also reflected in the nonsignificant 
interaction in the analysis of variance, it appears 
that the birth-order finding is mostly attributable 
to females. 
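
Two details of this analysis lend themselves to a short worked sketch. A point-biserial coefficient is simply the Pearson correlation between a 0/1 variable (here, firstborn or not) and a continuous score, and one common way to compare two independent correlations is Fisher's z transformation. The source does not say which test was used to compare the subsample correlations, so the Fisher comparison is an assumption, and the toy scores are invented.

```python
import math
from statistics import NormalDist


def point_biserial(flags, scores):
    """Pearson r between a 0/1 flag and a continuous score."""
    n = len(scores)
    mean_f = sum(flags) / n
    mean_s = sum(scores) / n
    cov = sum((f - mean_f) * (s - mean_s) for f, s in zip(flags, scores))
    var_f = sum((f - mean_f) ** 2 for f in flags)
    var_s = sum((s - mean_s) ** 2 for s in scores)
    return cov / math.sqrt(var_f * var_s)


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided p for the difference between independent correlations."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Invented scores: 1 = firstborn, 0 = latter born.
flags = [1, 1, 1, 0, 0, 0]
scores = [16, 14, 15, 12, 13, 11]
r_toy = point_biserial(flags, scores)
print(round(r_toy, 3))

# Subsample sizes (159 females, 190 males) and coefficients from the text.
p_diff = fisher_z_difference(0.19, 159, 0.08, 190)
print(round(p_diff, 2))  # about 0.30
```

The Fisher comparison lands near .30 rather than exactly on the published value, but it leads to the same conclusion: the female and male coefficients are not reliably different.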


DISCUSSION 


Although diverse behavioral phenomena are 
known to be associated with order of birth 
(Sampson, 1965; Warren, 1966), little in the 
way of theory has been proposed to account 
for these contingencies. More is known about 
the approval-dependence syndrome. A summary 
of the behavioral style associated with approval- 


TABLE 1 
MEAN NEED FOR APPROVAL SCORES 


Birth 

order First Latter 
Male 12.98 12.45 
Female 14.85 12.85 
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TABLE 2 
ANALYSIS OF VARIANCE OF NEED FOR 
APPROVAL SCORES 
Source af MS F 
A (sex) 1 107.5 4.66* 
B (birth order) 1 133.33 5.78** 
AXB 1 45 1,95 
Within cell 345 23.07 
* 
"p S005, 


dependence is given by Crowne and Marlowe 
(1964). 
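
The F ratios in Table 2 can be checked by hand, since each is the effect mean square divided by the within-cell mean square, and the within-cell degrees of freedom equal the total N minus the four cells. The short sketch below only reproduces that arithmetic from the figures printed in the table and the cell sizes given under METHOD; the variable names are illustrative.

```python
# Cell sizes from METHOD: males (first, latter), females (first, latter).
cell_sizes = [80, 110, 67, 92]
df_within = sum(cell_sizes) - len(cell_sizes)

ms_within = 23.07
mean_squares = {"A (sex)": 107.5, "B (birth order)": 133.33, "A x B": 45.0}

# Each effect has 1 df, so F = MS_effect / MS_within.
f_ratios = {name: round(ms / ms_within, 2) for name, ms in mean_squares.items()}

print(df_within)  # 345
print(f_ratios)   # {'A (sex)': 4.66, 'B (birth order)': 5.78, 'A x B': 1.95}
```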

Certain correspondence between findings in the 
area of birth order and approval motivation can 
be identified. Schachter (1964) has demonstrated 
that firstborns seek affiliation by fraternity 
membership but tend to be underchosen by 
fellow fraternity members. Further, firstborns 
preferred association with popular peers, that is, 
their preferences were more in conformity with 
normative choices than was the case for latter 
borns. These findings are strikingly similar to 
Bank’s conclusions vis-a-vis approval-dependent 
Ss as reported in Crowne and Marlowe (1964). 
Both firstborns (Capra & Dittes, 1962; Wolf & 
Weiss, 1965) and approval-dependent Ss (Mc- 
David, 1965) volunteer more often for psycho- 
logical experiments. Finally, several investiga- 
tions reviewed by Warren (1966, pp. 43-44) 
indicate that firstborns tend to conform to 
social influence as well as to reject deviates more 
often than do latter borns. Conformity to norma- 
tive standards as a means of seeking approbation 
is a known characteristic of approval-dependent 
Ss. There are then examples of obvious overlap 
between behavioral characteristics of firstborns 
and approval-dependent persons. Moreover, sev- 
eral of the birth-order findings, particularly in 
the area of affiliation, appear to be restricted to 
females. Given this overlap in results in the two 
areas, the present study’s finding that firstborn 
females are higher in approval motivation sug- 
gests a theoretical orientation to ordinal position 
phenomena in terms of the approval-dependent 
syndrome. 

A final consideration, which cannot be an- 
swered on the basis of the present data, concerns 
the question of whether the birth-order—approval- 
dependence relationship would obtain if n Affilia- 
tion (n Aff) were partialed out. Scores on n Aff 
and M-C are associated (Crowne & Marlowe, 
1964), both reflecting a tendency to orient 
toward others, particularly in stressful situations. 
However, n Aff is conceptualized in terms of 
concern about positive affective relationships, 
that is, friendships, while M-C scores are thought 
of as representing a response to the current 
environment in a culturally appropriate and ac- 
ceptable manner. The latter construct is less 
restricted and a more diverse body of findings 
has been coordinated with it. It may, therefore, 
prove more valuable to approach ordinal posi- 
tion effects through the more general approval 
syndrome. 


VALIDATIONAL STUDY OF THE SELF-REPORT SCALE FOR 
PROCESS-REACTIVE SCHIZOPHRENIA 


MONTY H. JOHNSON AND HAROLD A. RIES 
Stockton State Hospital 


The self-report scale developed by Ullmann and Giovannoni (1964) to meas- 
ure process-reactive schizophrenia was correlated with ratings derived from the 
Premorbid subscale of the Phillips Prognostic Rating Scale. 2 samples totaling 
91 male schizophrenics were used. Correlations between the 2 scales were -.75 
and -.58 for the 2 samples. These correlations support the use of the self- 
report scale as a means of placing schizophrenics along the process-reactive 
continuum. 


In recent years the process-reactive continuum 
has assumed a prominent role in reducing the 
heterogeneity of the schizophrenic population. 
Numerous studies (Garmezy & Rodnick, 1959; 
Herron, 1962; Higgins, 1964) have demonstrated 
its relationships to such variables as prognosis, 
responses to learning situations, autonomic reac- 
tivity and other psychological attributes. The 
measurement of this dimension usually has been 
based on either the Elgin (Wittman, 1941) or 
Phillips (1953) scales. Both depend on interview 
and case-history material and have the disad- 
vantage of being time consuming. Recognizing 
this problem, Ullmann and Giovannoni (1964) 
developed a 24-item self-report scale designed to 
facilitate the measurement of the process-reac- 
tive continuum. While the scale looks promising, 
the method used to develop it presents some 
questions about the scale’s predictive powers. The 
procedure for selecting items from an inventory 
designed to predict posthospital employment was 
based on face validity as judged by three clinicians; 
items then were subjected to internal consistency 
criteria. However, no external criterion 
was used either in the scale’s development or to 
demonstrate its predictive validity. 

The purpose of the present investigation was 
to compare the self-report scale with a more 
established criterion. For this purpose, the Premorbid 
subscale of the Phillips Prognostic Rating 
Scale (1953) was selected. 


METHOD 


Subjects 


Ninety-one subjects (Ss) were selected from a 
[illegible] acute treatment ward at Stockton State Hospital. 
All Ss were officially diagnosed schizophrenic, 
were free of organic involvement, and had not received 
EST for at least 1 month. All Ss were between 
[illegible] and 50 years of age with a mean age of 32.1 
years. The Ss’ mean IQ, based on the Shipley-Hartford 
verbal scale, was 107.5. The first 50 Ss selected 
comprised an initial sample; a subsequent group of 

41 Ss made up a second sample. There were no 
significant differences between the two samples on 
the above variables. 


Procedure 

The self-report scale and the Shipley-Hartford 
verbal scale were administered to groups consisting 
of 1 to 5 patients. Both case history and personal
interview materials were used in deriving the Phil- 
lips ratings. Interviewing was done by the senior 
author on an individual basis. 


RESULTS AND DISCUSSION 


The Pearson product-moment correlation coefficient
between the Phillips scale and the self-report
scale for the first sample was −.75, p < .01.
The correlation for the second sample was
−.58, p < .01. Inspection of the combined data
(Table 1) indicated that most of the discrepan- 
cies between the two scales cluster within the 13 
to 15 range on the self-report scale and that 
comparatively few misclassifications lie outside 
this range. Thus, it is apparent that there is a 
high agreement between the two scales at the 
extremes. When the Phillips scale is dichotomized
as indicated in Table 1, the self-report
scale results in approximately 90% correct placements
of the patients used in the present samples.


TABLE 1
COMPARISON OF PHILLIPS AND SELF-REPORT SCORES

                         Self-report scores
Phillips scores      Process              Reactive
                     0-12       13-15     16-24
Process (16-36)      36         10        4
Reactive (0-15)      4          12        25

Note. High score on self-report is reactive; high score on
Phillips is process.
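The "approximately 90%" figure can be checked against the cell counts in Table 1. The sketch below assumes one plausible reading: that placements are counted only for patients scoring outside the ambiguous 13-15 band on the self-report scale. The report does not state this explicitly, and the function name is illustrative.

```python
# Cell counts from Table 1 (rows: Phillips classification; columns: self-report band).
table = {
    "process":  {"0-12": 36, "13-15": 10, "16-24": 4},
    "reactive": {"0-12": 4,  "13-15": 12, "16-24": 25},
}

def placement_rate(counts):
    """Agreement rate among patients outside the 13-15 band (an assumed reading)."""
    hits = counts["process"]["0-12"] + counts["reactive"]["16-24"]
    misses = counts["process"]["16-24"] + counts["reactive"]["0-12"]
    return hits / (hits + misses)

total = sum(n for row in table.values() for n in row.values())
rate = placement_rate(table)
print(total, f"{rate:.1%}")
```

Under this reading, 61 of 69 patients are placed correctly, which is close to the figure quoted in the text; counting the 13-15 band in any way gives a lower rate.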

Despite the relative weakness noted in the 13- 
15 score range of the self-report scale, the highly 
significant correlations between it and the Phil- 
lips scores would seem to justify the use of this 
scale for differentiating schizophrenics along the 
process-reactive dimension. 
EFFECT OF TRANQUILIZERS ON THE TRAIL MAKING TEST
WITH CHRONIC SCHIZOPHRENICS 


STANFORD H. SIMON 


Veterans Administration Center, Wood, Wisconsin 


46 chronic schizophrenics with at least 3 years of continuous hospitalization,
no secondary diagnosis of brain damage, age under 59, and all stabilized on
their current medication were tested twice on the Trail Making Test (TMT)
with 6 wk. between testings. The experimental group (N = 28) had the 2nd
testing, following 5 weeks of being off all drugs. Results show: (a) previous
findings that TMT is not a sensitive test for organicity with schizophrenics
are true whether or not Ss are on tranquilizers, (b) no relationship between
amount of drugs and performance on TMT, and (c) drug withdrawal did not
affect performance on TMT.


While the Trail Making Test (TMT) has been 
shown to be quite sensitive in differentiating 
groups with brain damage from nonpsychotic 
controls (Armitage, 1946; Davids, Goldenberg, 
& Laufer, 1957; Reitan, 1955, 1958), studies in 
recent years have shown that chronic schizo- 
phrenics tend to score predominantly in the 
“brain damage” range (Brown, Casey, Fisch, & 
Neuringer, 1958; Knox & Whaley, 1963; Smith 
& Boyce, 1962; Tate, 1964). All of these studies 
have been done since the advent of tranquilizers, 
with almost no attempt to control for possible 
effects of such drugs. Since schizophrenics are 
typically put on fairly high dosages of medica- 
tion relative to that of nonpsychotics, the fol- 
lowing study was undertaken to assess the effects 
of tranquilizing drugs on schizophrenic TMT 
performance.


METHOD 


The Veterans Administration Cooperative Chemotherapy
Project No. 14 (CTP No. 14) provided the
occasion for this study. That project’s procedure included
the withdrawal of current medication for a
6-week period. Patients had to meet the following
criteria: primary diagnosis of schizophrenia, no secondary
diagnosis of brain damage, under 59 years
of age, at least 3 years of continuous current hospitalization,
and stabilization on current medication.
Fifty patients, all meeting these criteria, were included
in the present study, 30 in the experimental
group (CTP No. 14), and 20 in a control group
(without drug withdrawal). Because of being untestable
or released from the hospital, four patients
did not complete this study. This left the experimental
group with 28 subjects (mean age: [illegible],
range: 28-54) and control group with 18 subjects
(mean age: 40.5, range: 32-52), respectively. All
patients were tested twice with a 6-week interval
between testings. The experimental group was
tested prior to the start of CTP No. 14 and after 5
weeks of drug withdrawal (during this time these
patients were on an inert placebo). The TMT was
administered in accordance with the manual¹ prepared
by Reitan in 1959.

Most of the patients had been on a combination 
of Mellaril (thioridazine hydrochloride) and Stelazine
(trifluoperazine). Using the average of criteria
suggested by three psychiatrists² independently, for
mild, moderate, and heavy doses relative to levels 
prescribed for nonpsychotics, the following percent- 
age of patients fell at these drug levels: mild, 4.4%; 
moderate, 19.6%; heavy, 76%. The daily dosage 
levels ranged from 25 milligrams of Mellaril to 400 
milligrams of Mellaril plus 90 milligrams of Stela- 
zine; the median ranked dosage was 200 milligrams 
of Mellaril plus 30 milligrams of Stelazine. 


RESULTS AND DISCUSSION 


The means and SDs of total time (Part A +
Part B of TMT) of both testings for the experimental
group were mean = 288.4, SD = 122,
and mean = 254.8, SD = 131.9; and for the control
group were mean = 275.97, SD = 91.9 and
mean = 262.5, SD = 120.8. An analysis of variance
for repeated measures indicated no significant
differences between groups, F (1, 44) = .19;
between tests, F (1, 44) = 2.88; nor for the
interaction, F (1, 44) = .42. The low scores, all
but one of which are well within the brain-dam- 
aged range suggested by Reitan (1955), do not 
appear to be related to the tranquilizer drugs 
that the patients were on. By the same token, 
removal from drugs did not significantly affect 
the performance on the TMT. 

The patients were rank ordered in terms of 
their dosage levels by considering the potency of 
Stelazine as 20 times greater than Mellaril (as
suggested by the dosage conversion table published
by Sandoz Pharmaceuticals). A Spearman


¹ Reitan, R. M. A manual for the administration
and scoring of the Trail Making Test, prepared at
Indiana University and available from the author.

² The author wishes to express his appreciation to
E. Leitschuh, M. Primakow, and E. Seno for their
cooperation.
rank correlation coefficient between the amount
of drugs and total time (Part A + Part B of
TMT) on first testing for the experimental and
control subjects combined was r_s = .05, giving
evidence that there is no relationship between
the amount of drugs and TMT scores for these
patients.
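The rank ordering described above rests on a single conversion: each milligram of Stelazine is counted as 20 milligrams of Mellaril. A minimal sketch of that conversion, applied to the lowest, median, and highest daily dosages reported in the Method section, follows. The function and variable names are illustrative, and the actual conversion table is not reproduced in the report.

```python
STELAZINE_TO_MELLARIL = 20  # potency ratio stated in the text

def mellaril_equivalent(mellaril_mg, stelazine_mg=0):
    """Express a combined daily dose in Mellaril-equivalent milligrams."""
    return mellaril_mg + STELAZINE_TO_MELLARIL * stelazine_mg

# (Mellaril mg, Stelazine mg) pairs taken from the reported dosage range.
reported = {"lowest": (25, 0), "median": (200, 30), "highest": (400, 90)}
equivalents = {label: mellaril_equivalent(*dose) for label, dose in reported.items()}
print(equivalents)
```

Ranking patients on this single scale is what makes a Spearman rank correlation with TMT time possible when patients are on two different drugs.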

As a final comment, the test-retest reliabilities
for these patients were: experimental group,
r = .587; control group, r = .708. The difference
between groups is not significant. As far as the
author knows, these are the first test-retest correlations
published for the TMT using the current
instructions.
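The report does not say how the two reliabilities were compared. One standard approach for two independent correlations is Fisher's r-to-z transformation, sketched below with the reported values and group sizes (28 and 18). Treat the choice of test as an assumption rather than the author's stated method.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)       # Fisher r-to-z transform
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # standard error of z1 - z2
    return (z1 - z2) / se

z = fisher_z_difference(0.587, 28, 0.708, 18)
print(round(z, 2))
```

The statistic is well inside the conventional two-tailed bound of 1.96, which agrees with the statement that the difference between groups is not significant.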
WHAT KINDS OF ANXIETY DOES THE TAYLOR MA MEASURE?¹


EVELYN CRUMPTON, HARRY M. GRAYSON, AND PATRICIA KEITH-LEE


Brentwood Hospital, Veterans Administration Center, Los Angeles 


This study is concerned with the relationships of the Taylor Manifest Anxiety 
(MA) scale with the scales and individual items of the Brentwood Mood Scale 
in a group of 159 hospitalized chronic psychiatric patients. 3 types of anxi- 
ety—subjectively experienced fear, anxiety expressed in physical tension, and 
generalized uncertainty—were found to be related to scores on the MA, with 
uncertainty less related than either subjectively or physically felt anxiety. The 
MA was also found to be related to subjective feelings of depression and to 
an absence of positively toned emotional reactions. The MA was inversely 
correlated with words connoting drive level. 


The aim of this study was to further the under- 
standing of an important research and clinical 
tool, Taylor’s (1953) Manifest Anxiety (MA) 
scale.

The MA was developed as a research instru- 
ment, specifically to test hypotheses derived from 
the Hull-Spence S-R drive-reduction theory, and 
it was intended to identify extreme groups dif- 
ferentially disposed to be driven by anxiety in a 
stress situation. But clinicians and clinical re- 
searchers put the MA to work at many more 
tasks than the one for which it was designed. 
The MA is now routinely used to indicate the 
level of consciously admitted anxiety of the sub- 
ject, and there is a considerable body of litera- 
ture indicating that such use is as justified as the 
use of any other questionnaire method. Thus, for 
clinical purposes, it becomes important to find 
some answers to the question: What kinds of 
anxiety does the MA measure? 


METHOD 


This study compares MA scores with responses to 
the Brentwood Mood Scale (BMS) in a sample of 
159 hospitalized psychiatric patients. 

The Brentwood Mood Scale is a 72-item word list
for self-description, developed empirically in previous
studies by these investigators and others (Crumpton
& Wine, 1964, 1965).² Each adjective on the
checklist has been found to be easily understood,
appropriate for description of present emotional
state, and used frequently enough by both normals
and psychiatric patients to make possible statistical
analysis with samples of practical size.


¹ This study was conducted with the cooperation
of the following: Psychology Service, Brentwood
Hospital, Veterans Administration Center, Los Angeles;
the Department of Psychiatry, UCLA Center
for the Health Sciences; and the Department of
Psychology, UCLA. Statistical analysis was performed
by the Statistical Unit, Psychology Service,
Brentwood Hospital. This research was reported in
a paper read at the California State Psychological
Association, San Francisco, January 1966.

² Unpublished reports are available from the authors.

Subjects were 159 hospitalized male veteran pa- 
tients with chronic psychiatric illnesses. Diagnoses 
were primarily functional psychoses, with a fair 
number of psychoneuroses and a scattering of acute 
and chronic brain syndromes. All of the patients
had been in the hospital for at least 6 months and
were expected to remain for at least 4 months.
Average age was 44 years, with a range of 20 to 69. 
Mean educational level was high school graduate. 

The two instruments were administered to pa- 
tients in groups of 8 to 10. The MA was consistently 
administered before the BMS, and the two were
separated by three other tests. 


RESULTS AND DISCUSSION 


A correlational analysis was made of MA 
scores compared with BMS scores and responses 
to each BMS item. 

The BMS Sum-Fear Percent is the percentage
of fear words³ to the total number of responses.
It should be noted that the term “fear” is used
in an atheoretical sense, in its dictionary definition
of “an unpleasant often strong emotion
caused by anticipation or awareness of danger.
... [Of its synonyms] fear is the general term
and implies anxiety and usually loss of courage.”
The 16 words comprising the Sum-Fear Percent
scale were selected as describing fear states
and divided into three subscales, identified as
Fear I (subjective), Fear II (physical), and
Fear III (indecision or uncertainty).


³ Words were assigned to categories (fear, good,
reduced drive, etc.) according to the judgment of
the investigators based on dictionary definitions.
Interrater reliability of judgment was checked by
securing judgments from a group of seven staff
psychologists and three psychology research assistants
(P) and from a second group of 16 junior college
students in a rapid reading class (JC). Percentages
of agreement ranged from 57% to 100%. For the
categories referred to here, the following percentages
of agreement were obtained: fear: P, 80%, JC,
57%; positive emotional state or good: P, 100%,
JC, 98%; depression: P, 89%, JC, 80%; psychotic:
P, 96%, JC, 65%; anger: P, 90%, JC, 71%. In
every instance the obtained agreement far exceeded
chance expectancy.

The Sum-Fear percent was correlated .59 with
the total MA score, this value (which is well
beyond what is required for the .001 level of
significance) indicating a positive relationship of
moderate size between the two variables. The
subscale Fear I (subjective) was correlated .43
with the MA, Fear II (physical) .45, and Fear
III (uncertainty) .26 (the latter subscale being
significantly related at the .01 level with the
MA, but significantly less related than either of
the other two).
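The significance levels quoted for these correlations can be checked with the usual t test for a Pearson r, t = r·sqrt(N − 2)/sqrt(1 − r²), with N = 159. The sketch below assumes two-tailed tests and uses tabled critical values for 120 degrees of freedom (about 2.62 at the .01 level and 3.37 at the .001 level) as a conservative stand-in for the 157 degrees of freedom here. The report does not describe its own test.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

N = 159  # sample size reported in the Method section
correlations = {"Sum-Fear": 0.59, "Fear I": 0.43, "Fear II": 0.45, "Fear III": 0.26}
t_values = {name: t_from_r(r, N) for name, r in correlations.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

With these inputs the Sum-Fear correlation gives a t above 9, far past the .001 cutoff, and even the smallest correlation (.26) clears the .01 cutoff, consistent with the text.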

The analysis was repeated using simple fre- 
quency scores instead of percentages. The size 
of the correlations was essentially the same. 

Of the 16 adjectives comprising Sum-Fear, 
15 were correlated with the MA at the .01 level 
or better, only self-conscious having no relation- 
ship. The most highly correlated individual ad- 
jective was tense, with an r of .56, closely 
followed by nervous with an r of .53. Other fear 
words related to the MA score at the .01 level 
or better were, in the order of the size of the 
correlation coefficient: fearful, insecure, afraid,
upset and anxious (tied), indecisive, undecided, 
restless, timid, unsettled, shaky, and panicky and 
uncertain (tied). 

These results confirm the MA score as a reflec- 
tion of the level of self-described anxiety, es- 
pecially as expressed in physical tension and 
subjectively experienced fearfulness. 

The correlational analysis was done also for 
the remainder of the BMS words. It was antici- 
pated that the MA would be inversely related to 
words connoting a positive emotional state. As 
expected, 13 of the 14 “good” words were
significantly negatively correlated with the MA,
only contented showing no relationship. The word
well, with an r of −.70, had a higher correlation
coefficient than any other word, even the fear
words.

Considering the theoretical basis of the MA,
one might expect a high MA to go with self-descriptions
suggestive of high drive level, but
results showed exactly the opposite. The MA
was positively correlated with inactive, weary,
listless, weak, slow, sluggish, and wornout; negatively
correlated with alert and active; and not
correlated at all with excitable.

In this population, the MA score appeared to 
reflect the level of subjectively experienced feel- 
ings of depression about as well as it did the 
level of anxiety. The MA was correlated .56 
with the word depressed and significantly cor- 
related with 10 of the remaining 11 words con- 
noting depression; hopeless, unhappy, miserable, 
lonely, useless, moody, unsatisfied, blue, worth- 
less, and empty; only sinful was not significantly 
related to MA. 

Although MA was significantly correlated with 
the word confused, it showed no relationship 
to any of the other words indicating a psychotic 
state of mind. MA was correlated with three of 
the eight words connoting anger (irritable, 
annoyed, and violent). 

It is interesting that MA was inversely related 
to the word well, with the highest correlation 
coefficient of any (—.70) but not related at all 
to the word sick, which had an r of .08. Ap- 
parently these psychiatric patients equate an 
absence of anxiety with being well, but consider 
the presence or absence of anxiety irrelevant to 
being sick. This finding, incidentally, confirms 
those of other studies by the senior author and 
co-workers (Crumpton & Groot, 1964; Crump- 
ton, Weinstein, Acker, & Annis, 1963; Crumpton 
& Wine, 1964, 1965) which suggest that mental 
patients do not accept the euphemism “sick” as 
having any application to themselves. 

On the whole, the results affirm the construct 
validity of both scales, as well as provide new 
evidence for the interpretation of the MA score. 
The MA appears to measure, primarily, the 
level of consciously experienced anxiety and 
physical tension, subjective feelings of depres- 
sion and reduced energy level, to a lesser extent 
the milder feelings of indecision and uncertainty, 
and certainly the absence of positively toned 
emotional states. One would hope for a purer 
measure of anxiety, but it may be that in a 
hospitalized psychiatric sample anxiety and 
depression are hopelessly intermingled. 

The finding that MA is related to reduced 
rather than increased drive level has implications 
not only for the clinical use of the MA, but 
also for the research use for which it was de- 
veloped. Separating subjects into extremes of 
drive level, as has been done in many experi- 
ments using normal subjects, would certainly give 
misleading results if done in the same way with 
this psychiatric sample. It is reasonable to think
that psychiatric patients with considerable anxiety
might also develop psychological paralysis
from fear of increasing that anxiety, whereas
normal subjects could more easily tolerate
being driven by their anxiety. It would be interesting
to see if there really is this interaction
between drive level and psychiatric status affecting
the MA. One could select the extremes
of drive level (as indicated by the BMS or by
some other questionnaire method) in both a
normal sample and a psychiatric sample, using
the MA as the dependent variable in a double-classification
analysis of variance, and test for
interaction.
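The proposed double-classification design can be sketched as a balanced 2 × 2 analysis of variance in which only the interaction term is of interest. The code below is a minimal illustration, assuming equal cell sizes; the MA scores are invented solely to show the computation and are not data from this or any study.

```python
from statistics import mean

def interaction_f(cells):
    """Interaction F for a balanced two-way design.

    cells maps (psychiatric_status, drive_level) to a list of MA scores.
    """
    rows = sorted({key[0] for key in cells})
    cols = sorted({key[1] for key in cells})
    n = len(next(iter(cells.values())))
    assert all(len(scores) == n for scores in cells.values()), "balanced design assumed"

    grand = mean(x for scores in cells.values() for x in scores)
    row_mean = {r: mean(x for c in cols for x in cells[(r, c)]) for r in rows}
    col_mean = {c: mean(x for r in rows for x in cells[(r, c)]) for c in cols}

    # Interaction: what is left of each cell mean after removing both main effects.
    ss_inter = n * sum(
        (mean(cells[(r, c)]) - row_mean[r] - col_mean[c] + grand) ** 2
        for r in rows for c in cols
    )
    ss_within = sum((x - mean(scores)) ** 2 for scores in cells.values() for x in scores)
    df_inter = (len(rows) - 1) * (len(cols) - 1)
    df_within = len(rows) * len(cols) * (n - 1)
    return (ss_inter / df_inter) / (ss_within / df_within), df_inter, df_within

# Hypothetical scores: high drive goes with high MA in normals, low MA in patients.
crossed = {
    ("normal", "high"): [20, 22, 24], ("normal", "low"): [10, 12, 14],
    ("patient", "high"): [10, 12, 14], ("patient", "low"): [20, 22, 24],
}
# Hypothetical scores with purely additive effects (no interaction).
additive = {
    ("normal", "high"): [20, 22, 24], ("normal", "low"): [10, 12, 14],
    ("patient", "high"): [25, 27, 29], ("patient", "low"): [15, 17, 19],
}
f_crossed, df1, df2 = interaction_f(crossed)
f_additive, _, _ = interaction_f(additive)
print(f_crossed, df1, df2, f_additive)
```

A large interaction F, as in the crossed example, is the pattern the authors' conjecture would predict; the additive example shows the F dropping to zero when drive level relates to MA in the same way in both samples.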
INTELLECTUAL DIFFERENCES BETWEEN SUBURBAN
PRESCHIZOPHRENIC CHILDREN AND THEIR SIBLINGS¹


ANNA SCHAFFNER
Fairhill Psychiatric Institute
ELLEN A. LANE AND GEORGE W. ALBEE


Western Reserve University


The intelligence test scores of 56 suburban school children who later became
schizophrenic adults were found to be significantly lower than the intelligence
test scores of their siblings on tests taken at the same ages. Early intellectual
deficit of future schizophrenics which is not easily discernible in suburban
children, whose intelligence test scores often are above average but near the
average of their own schools, shows up in comparisons with the performance
of their own siblings.


Earlier studies (Lane & Albee, 1964, 1965) 
have shown that children who later become 
schizophrenic adults scored significantly lower on 
intelligence tests during second, sixth, and eighth 
grades than their own nonaffected siblings. The 
previous studies were based on samples of schizophrenics
who had come from central city backgrounds
and generally from low socioeconomic
levels. While the scores on intelligence tests of 
children from these neighborhoods were below 
average, the preschizophrenic children actually 
scored below their own neighborhood school 
norms (Albee, Lane, & Reuter, 1964). 


¹ The research reported in this paper was supported
by Grant M-5186 from the National Institute
of Mental Health, United States Public Health
Service, Department of Health, Education, and
Welfare.


The purpose of the present study was to de- 
termine whether schizophrenics-to-be, growing up 
in middle and upper class suburbs, could be 
distinguished (in the same way as the urban 
sample) from their own nonaffected siblings on 
the basis of performance on intelligence tests 
given in grade school. In contrast to the urban 
groups of future schizophrenics, these suburban 
children, who later developed adult schizophrenia, 
came from areas where the average intelligence 
in the school was above the national average and 
where motivation and school achievement have 
always received strong emphasis. 


METHOD 


Lists of names of adults diagnosed as schizophrenic
at three state hospitals and one large university
hospital were checked at the attendance bureaus of
four suburban public school systems in the greater
Cleveland area. From these sources, the names of
all siblings who had attended these schools were
also obtained.

All intelligence test scores (both group and indi- 
vidual tests) that were available for each subject 
and his siblings were recorded. Because it was found 
that 14 different intelligence tests had been used, 
each was normalized to a mean of 100 and a stan- 
dard deviation of 15, based on the standardization 
sample for each test. In this way, a z score was 
computed for each subject’s and each sibling’s test 
scores, making the different tests comparable. 
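The rescaling step can be sketched as follows: a raw score is converted to a z score using the mean and standard deviation of that test's standardization sample, then placed on a common metric with mean 100 and standard deviation 15. The test names and norms below are hypothetical placeholders, since the 14 tests and their norms are not listed in the report.

```python
def to_common_iq(raw, norm_mean, norm_sd):
    """Rescale a raw score to a mean-100, SD-15 metric via its z score."""
    z = (raw - norm_mean) / norm_sd
    return 100 + 15 * z

# Hypothetical norms for two tests with different native scales.
norms = {"test_a": (50, 10), "test_b": (100, 16)}

score_a = to_common_iq(60, *norms["test_a"])   # one SD above the test_a mean
score_b = to_common_iq(116, *norms["test_b"])  # one SD above the test_b mean
print(score_a, score_b)
```

Both hypothetical scores sit one standard deviation above their own test's mean, so they land on the same value after rescaling, which is the sense in which the procedure makes different tests comparable.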

The final sample consisted of 56 suburban chil- 
dren, who were future adult schizophrenics, and their 
siblings. The total number of siblings was 112. 

Comparisons were made between subjects (chil- 
dren who later became adult schizophrenics) and 
their siblings at three different childhood levels of 
schooling: kindergarten through third grade, fourth 
grade through sixth grade, and seventh grade through 
ninth grade. The Ns varied because not all subject- 
sibling pairs had taken tests during all these periods. 
If a subject or a sibling had been tested more than 
once within any of these intervals, his mean IQ 
was used. When more than one sibling had been 
tested within the time interval, the mean IQs were 
averaged. In this way a single IQ score for each 
subject and for each sibling or set of siblings was 
used for each particular comparison. 
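The two-stage averaging rule above can be sketched in a few lines: first average repeated tests within a child for the interval, then average across siblings. The scores are hypothetical and the function name is illustrative.

```python
from statistics import mean

def interval_iq(children_scores):
    """Single IQ for one interval.

    children_scores holds one list of IQs per child tested in the interval.
    Each child's scores are averaged first, then the child means are averaged.
    """
    per_child = [mean(scores) for scores in children_scores if scores]
    return mean(per_child) if per_child else None

# Hypothetical family: the subject was tested twice; two siblings were tested.
subject_iq = interval_iq([[104, 108]])
sibling_iq = interval_iq([[110, 114], [118]])
print(subject_iq, sibling_iq, sibling_iq - subject_iq)
```

Averaging within a child before averaging across siblings keeps a frequently tested sibling from dominating the family's comparison score.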


RESULTS AND DISCUSSION 


Table 1 shows that children from the suburbs 
who later became schizophrenic adults, like pre- 
schizophrenics from the central city, score lower 
than their own siblings on intelligence tests given 
in public school. These differences reach sta- 
tistical significance at two of the three periods 
compared. There is a strong tendency in the 
same direction for the third comparison. 

It is interesting that the preschizophrenics
show this intellectual deficit when compared to
their siblings, in spite of the fact that their own
IQ scores are above average. Unlike the preschizophrenics
from the central city, these children
are not lower than the national averages
nor lower than their own peers in school (Schaffner,
1964). They did not come from intellectually
inferior families within the suburbs. Indeed,
the nonaffected sibling members of the families
in this sample score as a group almost a full
standard deviation above national averages.
Without this comparison with their siblings, it
would be difficult to conclude that preschizophrenics
from the suburbs evidence any intellectual
deficit discernible on childhood IQ tests.
Nevertheless, they are considerably lower in performance
than would be predicted on the basis
of their siblings’ scores. This finding, in addition
to those of the original studies, suggests the
general conclusion that schizophrenics-to-be score
lower than their own siblings, no matter what
their socioeconomic level. There is evidence of
intellectual deficit in children who later develop
schizophrenia, long before the actual onset.


TABLE 1

INTELLECTUAL PERFORMANCE OF SUBURBAN
PRESCHIZOPHRENICS AND THEIR SIBLINGS

Grade in             Schizophrenic       Sibling
school     N pairs   IQ       SD         IQ       SD       p
K-3        25        106.6    17.1       111.0    16.0     .05
4-6        26        107.6    20.5       111.1    14.5     ns
7-9        32        107.7    16.3       114.3    16.8     .01

The question of the relationship between
schizophrenia and intellectual deficit in childhood,
however, has not been settled by these
results. Whether an ongoing process of schizophrenia
causes lowered intellectual performance
due to decreased motivation, or whether that
child with the lowest IQ in the family, other
necessary conditions being present, is the most
likely to develop schizophrenia, will have to be
determined by future research.
DIMENSIONS OF PSYCHIATRIC PATIENT WARD BEHAVIOR


LEE GUREL¹
Veterans Administration, Washington, D. C.


Principal factor Varimax rotation factor analyses were performed on 7 sets of
ward behavior ratings collected during a 4-yr. follow-up of 1,274 functional
psychotics. 4 factors were identified: Withdrawal-Apathy, Hostility-Resistiveness,
Deteriorated Appearance, and Schizophrenic Disorganization. The stability
of these factors and their correspondence with results of other factor
analyses in the literature suggest their conceptual meaningfulness for further
use.


Lorr, Klett, and McNair (1963), in the course 
of their own important contribution, stress the 
paucity of empirical findings in the area of factor 
analytically derived syndromes of psychopathol- 
ogy. The purpose of this paper is to present 
data on a set of factor dimensions useful in 
conceptualizing the ward behavior of hospitalized 
functional psychotics. It parallels in design and 
scope the recent work of Cohen, Gurel, and 
Stumpf (1966) on dimensions underlying repeated 
symptom ratings.


METHOD 
Subjects 


As part of the larger Veterans Administration 
Psychiatric Evaluation Project (PEP) study of the 
effectiveness of psychiatric treatment facilities 
(Psychiatric Evaluation Project, 1964), 1,274 func- 
tional psychotics, 95% of whom were schizophrenic, 
were repeatedly rated on the PEP Behavior Rating 
Scale (BRS). The sample included virtually all 
patients who met selection criteria during approxi- 
mately 1 year of intake (August 1956-September 
1957) at 12 participating hospitals. Subjects were 
male and under 60, were without serious physical 
illness, known organic brain disease, or psycho- 
surgery, had not been hospitalized in a psychiatric 
facility for as much as 90 of the 180 days prior to 
admission, and remained alive during a 4-year study 
period dating from admission.


Data Collection 


The BRS was completed at admission (admission
BRS; N = 1,274) and when patients first exited the
hospital within 2 years of admission (exit BRS;
n = 976). Ratings were also made on patients who
were physically in the hospital 6 months, 1 year,
2 years, 3 years, or 4 years after admission
(reevaluation BRS; n = 637, 479, 366, 356, and 330,
respectively). In all, then, seven sets of ratings were
completed, the first on the total sample of 1,274
and the last six on subsamples of the total. Ratings
were made by specifically assigned nurses or nursing
assistants on the basis of patient behavior during
the 2 days preceding rating.


¹ Although appearing under the authorship of the
present director, this paper represents the combined
effort of all PEP staff. We are particularly indebted
to the anonymous nursing service personnel
who did the rating; Richard L. Jenkins, former
project director, under whose aegis most of these
data were collected; Jacob Cohen, chief consultant
to PEP, for invaluable statistical assistance; and
Nancy E. Rains for bibliographic research.


Behavior Rating Scale 


The 20 items of the BRS² are listed in Table 1.
It can be seen that some of the behaviors measured 
are minimally relevant to the contemporary mental 
hospital scene. The means and standard deviations 
in Table 1 confirmed the impressions gained during 
data collection of infrequent pathological response 
to some items and, therefore, sharply restricted vari- 
ance. It is to be noted that the initial development 
of the BRS and the decision to use it in PEP 
were accomplished before the milieu changes which 
were effected in many psychiatric hospitals during 
the early and mid-1950’s. Although the present 
analyses, by virtue of favorable sampling and 
lengthy follow-up, are instructive in clarifying some 
of the dimensions underlying ward behavior, several 
more recently developed instruments (Aumack, 1962; 
Ellsworth, 1962; Honigfeld & Klett, 1965) would
probably be more relevant for present-day clinical 
or research use. 
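The link between infrequent pathological responses and restricted variance can be made concrete. Assuming each BRS item is scored 0 or 1 and the Table 1 means are proportions of "Yes" responses, the standard deviation of an item is sqrt(p(1 − p)), which shrinks as p approaches 0 or 1. The sketch below applies this population formula to three item means from Table 1; the scoring assumption is mine and is not spelled out in the text.

```python
import math

def binary_sd(p):
    """Standard deviation of a 0/1 item endorsed with proportion p."""
    return math.sqrt(p * (1 - p))

# Proportions of "Yes" responses taken from Table 1.
items = {"wet or soil self": 0.02, "know what time it is": 0.86, "seem to be unhappy": 0.47}
sds = {name: binary_sd(p) for name, p in items.items()}
for name, sd in sds.items():
    print(f"{name}: SD = {sd:.2f}")
```

An item endorsed for only 2% of patients can have a standard deviation of no more than about .14, against roughly .50 for an item endorsed about half the time, and this ceiling limits how strongly such items can correlate with anything else.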


Data Analysis 


Using an IBM 7094 computer, each of the seven 
sets of BRS protocols was intercorrelated, and the 
resulting 20 × 20 matrices of intercorrelations were
factored by the principal factor solution (Harman,
1960, Chap. 9), first with ones in the diagonal and
again with squared multiple correlations (SMCs) 
in the diagonal. Using Kaiser’s normal Varimax 
method (1958), four and five or five and six factors, 
depending on latent root size, were rotated for each 
of the seven sets of BRS data. As noted below, 
Item 17, “maintains manneristic postures or move- 
ments,” was judged to be a poor item and did not 
fall with any of the interpretable factors obtained. 


2 Grateful acknowledgment is made to Stanford 
University Press for permission to use items from 
the Hospital Adjustment Scale and to Leo Shatin 
for permission to use items from the Albany 
Behavior Rating Scale. 
TABLE 1


MEANS AND STANDARD DEVIATIONS OF BEHAVIOR
RATING SCALE ITEMS AT ADMISSION


No.  Item                                          M    SD

Does the patient:

 1.  Wet or soil self                              02   15
 2.  Know what time it is                          86   34
 3.  Ignore what goes on around him                24   43
 4.  Need help in sticking to activities           43   59
 5.  Seem to be unhappy                            47   50
 6.  Read newspapers and magazines                 63   48
 7.  Resist when asked to do things                13   33
 8.  Destroy property                              03   17
 9.  Pick fights with other patients               04   20
10.  Answer sensibly when talked to                81   39
11.  Swear or use obscene language                 07   25
12.  Leave his clothes unbuttoned                  12   33
13.  Give difficulty in holding attention          28   45
14.  Use words that are understandable             87   34
15.  Stay neat and clean                           83   38
16.  Chat with other patients                      59   49
17.  Maintain manneristic postures or movements    23   42
18.  Yell at aide when he is disgusted             07   26
19.  Want to do the right thing on the ward        82   38
20.  Seem to be a good worker                      61   49

Note.—Means computed to represent percentage of “Yes” 
responses. 


In order to eliminate any possible distorting effects
of Item 17 on the factored solutions previously
obtained, 19 × 19 matrices with Item 17 deleted
were factored with SMCs in the diagonal, and four
factors were rotated. In all, then, there were
[(2 × 2 × 7) + 7] = 35 rotated factor solutions.

The similarities among the 35 rotated solutions
were determined by computing φ, a coefficient of
congruence or factor similarity (Harman, 1960,
p. 257), between all possible pairs of rotated factors.
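A coefficient of congruence between two columns of factor loadings is commonly computed as the sum of cross-products divided by the square root of the product of the two sums of squares. The sketch below uses that common form. The first vector holds the Factor II loadings of Items 7, 8, 9, 11, and 18 from Table 2; the second is a hypothetical set of loadings from another solution, invented only to show the calculation.

```python
import math

def congruence(a, b):
    """Coefficient of congruence between two loading vectors of equal length."""
    cross = sum(x * y for x, y in zip(a, b))
    return cross / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

factor_ii_admission = [0.58, 0.60, 0.71, 0.70, 0.82]  # Items 7, 8, 9, 11, 18 in Table 2
factor_ii_other = [0.55, 0.62, 0.68, 0.73, 0.80]      # hypothetical second solution

phi = congruence(factor_ii_admission, factor_ii_other)
print(round(phi, 3))
```

Unlike a correlation, the coefficient does not subtract the mean loading, so it reaches 1 only when the two loading patterns are proportional, and it is 0 when the patterns share no common loadings.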


RESULTS AND DISCUSSION 


A detailed presentation of the results greatly
transcends the capacity of an article and the
interest of any but the most specialized reader.³
Since the several factor solutions at each of the
seven data-collection points were found to be
highly similar, only the rotated five-factor solution
with ones in the diagonal for the admission
BRS is reproduced here (Table 2). However,
a review of all 35 rotated matrices indicated that
the BRS was most meaningfully interpreted as
comprised of four common factors, interpretation
of which can be adequately conveyed by
the rotated matrix reproduced here.


³ A considerable body of information on the BRS
development, measurement properties, and uses in 
PEP is contained in an extensive technical report 
available from the author (Psychiatric Evaluation 
Project, 1963). 
Factor I: Withdrawal-Apathy


Factor I was primarily defined by Item 3, 
ignore what goes on around him; Item 4, need 
help in sticking to activities; Item 5, seem to 
be unhappy; Item 6, (not) read newspapers and 
magazines; Item 13, give difficulty in holding 
attention; Item 16, (not) chat with other pa- 
tients; and Item 20, (not) seem to be a good 
worker. This first factor was characterized at 
the pathological (positive) extreme by with- 
drawal and seclusiveness, self-preoccupation, and 
noninvolvement and has been labeled With- 
drawal-Apathy. Item 19, (not) want to do the 
right thing on the ward, had an important sec- 
ondary loading on Factor I, and it was the only 
item which split importantly between factors. 


Factor II—Hostility-Resistiveness


Factor II was quite obviously a dimension of 
hostility, resistiveness, and destructiveness. It was 
defined by Item 7, resist when asked to do 
things; Item 8, destroy property; Item 9, pick 
fights with other patients; Item 11, swear or use 
obscene language; Item 18, yell at aide when he 
is disgusted; and to a lesser extent by Item 19, 
(not) want to do the right thing on the 
ward. Factor II may therefore be identified as 
Hostility-Resistiveness. 


TABLE 2

ROTATED FACTOR LOADINGS OF ADMISSION BRS:
FIVE-FACTOR SOLUTION WITH UNITY IN DIAGONAL

                      Factor
Item
No.      I     II    III     IV      V     h²
 1.    -02     18     62     13    -25     49
 2.    -30    -04    -30    -59    -04     53
 3.     59     02     17     27    -21     49
 4.     68     16     08     20    -11     55
 5.     64     15    06?    -15     00     46
 6.    -72    -01    -14    -13     00     55
 7.     37     58     03     12    -04     49
 8.    -02     60     40     04    -20     57
 9.     04     71     19     02    -11     56
10.    -27    -24    -12    -7?     06     67
11.     01     70     13     07     0?     51
12.     25     20     74     15     04     68
13.      ?      ?      ?      ?      ?      ?
14.    -05    -06    -14    -77     11     63
15.    -26    -09     73    -23    -14     68
16.      ?    09?     12     08     06     56
17.     16     10     04     11     89     84
18.     06     82    -02     00     02     69
19.    -41    -52    -09    -34    -10     57
20.    -66    -16     01    -27    -10     54

Note.—Decimals preceding all values omitted.
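
The h² column of Table 2 can be checked against the loadings, since with orthogonally rotated factors an item's communality is the sum of its squared loadings. The sketch below does this for four rows whose entries are legible in the source; the decimal points are restored, the variable names are illustrative, and the orthogonality of the rotation is an assumption of this check.

```python
# (loadings on Factors I-V, printed h2) for four legible rows of Table 2.
rows = {
    2: ([-0.30, -0.04, -0.30, -0.59, -0.04], 0.53),
    3: ([0.59, 0.02, 0.17, 0.27, -0.21], 0.49),
    17: ([0.16, 0.10, 0.04, 0.11, 0.89], 0.84),
    20: ([-0.66, -0.16, 0.01, -0.27, -0.10], 0.54),
}


def communality(loadings):
    """Sum of squared loadings across the rotated factors."""
    return sum(x * x for x in loadings)


recomputed = {item: round(communality(l), 2) for item, (l, _) in rows.items()}

for item, (loadings, printed) in rows.items():
    print(f"Item {item}: recomputed h2 = {recomputed[item]:.2f}, printed = {printed:.2f}")

# Share of Item 17's communality carried by Factor V alone.
item17_share = 0.89 ** 2 / communality(rows[17][0])
```

For Item 17 the fifth factor alone accounts for most of the communality, which agrees with the text's description of that factor as defined by a single item.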
Factor III—Deteriorated Appearance


Factor III suggested a dimension of deteriora- 
tion in personal appearance and will be referred 
to as simply Deteriorated Appearance. It was 
primarily defined by Item 1, wet or soil self; 
Item 12, leave his clothes unbuttoned; and Item 
15, (not) stay neat and clean.


Factor IV—Schizophrenic Disorganization


The pathological extreme of Factor IV re- 
flected gross cognitive psychopathology. For 
want of a better term, it may be called Schizo- 
phrenic Disorganization. This fourth factor was 
defined largely by Item 2, (not) know about 
what time it is; Item 10, (not) answer sensibly 
when talked to; and Item 14, (not) use words 
that are understandable. 

What appears in Table 2 as a fifth factor is 
defined by only a single item with practically 
no shared variance, Item 17, maintains manner- 
istic postures and movements. An earlier reli- 
ability study had shown this to be the least 
reliable item in the BRS, possibly because some 
raters confused the item with the issue of good 
manners, stuporous immobility, etc. The sup- 
posedly high incidence of mannerisms reflected 
by 14-28% “Yes” responses contradicts even the 
likelihood of face validity. Therefore, no inter- 
pretation of Factor V in Table 2 has been 
attempted. 

The coefficients of congruence (φ) computed
between factors in the 35 rotations indicated a
considerable similarity between the factor structures
obtained on the seven occasions when the
BRS was administered. The φ values between
manifestly similar factors were typically in the
.80s and .90s. As an example, the median φ
between a four-factor rotation of the admission
BRS and five-factor rotations of the other BRS
administrations was .87. These findings point to
an essential identity of the factor structures of
the seven sets of BRS data, an identity which
is the more impressive in view of the 4- to 5-year
time span over which ratings were collected,
the many raters involved, and the use of
dissimilar, albeit overlapping, subsamples. The
Withdrawal-Apathy, the Hostility-Resistiveness, 
and the Deteriorated Appearance factors were 
particularly stable. The Schizophrenic Disorgani- 
zation factor was somewhat less stable, prob- 
ably because of the low variances of the items 
defining it. 

Confidence in the meaningfulness of the factor
structure described above is increased by correspondence
of the factors identified to dimensions
described in other factor analyses of ward
behavior of psychotics. Factor I, Withdrawal-Apathy,
appears to be similar to the Withdrawal
dimension described by Lorr and O’Connor 
(1962), to Raskin and Clyde’s (1963) Social 
Participation, and to the second-order factor 
which Lorr et al. (1963, p. 59) posit to account 
for the three factors found by Guertin (1955); 
it corresponds especially closely to the second- 
order Withdrawal more recently reported by 
Lorr, Klett, and McNair (1964). The second 
factor, Hostility-Resistiveness, is particularly fa- 
miliar. A similar dimension appears to have been 
identified by Guertin (1952) as Excitement- 
Hostility, as Emotional Controls by Guertin and 
Krugman (1959), as Psychotic Belligerence by 
Wittenborn (1962), as Irritability by Raskin and 
Clyde (1963), and as a “higher level Hostile 
Belligerence” by Lorr et al. (1964, p. 295). 
There was a suggestion in some of the matrices, 
however, that, with a sufficient number of items, 
the present Hostility-Resistiveness could be 
meaningfully separated into two factors; these 
would then be comparable to the Resistiveness 
and Hostile Belligerence identified by Lorr and 
O'Connor (1962). Similarly, the present Schizo- 
phrenic Disorganization would seem to parallel 
Lorr’s first-order Thinking Disorganization (Lorr 
& O'Connor, 1962) and to be central to his 
second-order Schizophrenic Disorganization (Lorr 
et al., 1964). Factor III, Deteriorated Appear- 
ance, has no parallel in Lorr’s work, but it is 
recognizably similar to Guertin and Krugman’s 
Deteriorated Behavior (1959), to Wittenborn’s 
Hebephrenic Negativism (1962), and by virtue 
of partial item overlap is similar to Honigfeld 
and Klett’s Personal Neatness (1965). 

In brief, all four of the dimensions identified 
here resemble others reported in the literature. 
This result in combination with the finding of 
relative invariance over the seven sets of analyses 
strongly suggests the conceptual meaningfulness 
of the four factors identified. 


REFERENCES 


Aumack, L. A social adjustment rating scale. Journal
of Clinical Psychology, 1962, 18, 436-441.

Cohen, J., Gurel, L., & Stumpf, J. Dimensions of
psychiatric symptom ratings determined at thirteen
timepoints from hospital admission. Journal of
Consulting Psychology, 1966, 30, 39-44.

Ellsworth, R. B. The MACC Behavioral Adjustment
Scale. Los Angeles: Western Psychological
Services, 1962.

Guertin, W. H. A factor-analytic study of schizophrenic
symptoms. Journal of Consulting Psychology,
1952, 16, 308-312.

Guertin, W. H. A factor analysis of schizophrenic
ratings on the hospital adjustment scale. Journal
of Clinical Psychology, 1955, 10, 70-73.

Guertin, W. H., & Krugman, A. D. A factor
analytically derived scale for rating activities of
psychiatric patients. Journal of Clinical Psychology,
1959, 15, 32-35.

Harman, H. H. Modern factor analysis. Chicago:
University of Chicago Press, 1960.

Honigfeld, G., & Klett, C. J. The nurses observation
scale for inpatient evaluation: A new scale
for measuring improvement in chronic schizophrenia.
Journal of Clinical Psychology, 1965, 21,
65-71.

Kaiser, H. F. The Varimax criterion for analytic
rotation in factor analysis. Psychometrika, 1958,
23, 187-200.

Lorr, M., Klett, C. J., & McNair, D. M. Syndromes
of psychosis. New York: Pergamon, 1963.

Lorr, M., Klett, C. J., & McNair, D. M. Ward-observable
psychotic behavior syndromes. Educational
and Psychological Measurement, 1964, 24,
291-300.

Lorr, M., & O'Connor, J. P. Psychotic symptom
patterns in a behavior inventory. Educational and
Psychological Measurement, 1962, 22, 139-146.

Psychiatric Evaluation Project. Intramural report
63-1: Dimensions of psychiatric patient ward behavior.
Washington, D. C.: Veterans Administration
Psychiatric Evaluation Project, 1963.

Psychiatric Evaluation Project. Intramural report
64-5: An assessment of psychiatric hospital effectiveness.
Washington, D. C.: Veterans Administration
Psychiatric Evaluation Project, 1964.

Raskin, A., & Clyde, D. J. Factors of psychopathology
in the ward behavior of acute schizophrenics.
Journal of Consulting Psychology, 1963,
27, 420-425.

Wittenborn, J. R. The dimensions of psychosis.
Journal of Nervous and Mental Disease, 1962,
134, 117-128.


(Received April 25, 1966)


Journal of Consulting Psychology
1967, Vol. 31, No. 3, 331-332


COMPARISON OF THE WISC AND WAIS AT CHRONOLOGICAL 
AGE SIXTEEN 


ROBERT T. ROSS
California State Department of Mental Hygiene

AND JUNE MORLEDGE
Sonoma State Hospital, California


A group of 30 Ss was tested with the WISC and 4 wk. later with the WAIS. 
During this interval they all passed their 16th birthdays. Since chronological 
age is constant, correlations were calculated for the various IQ scales of the 2 
tests and indicated that IQs obtained at age 16 from the 2 scales are highly 
comparable. The mean IQs and standard deviations of the experimental groups 
were not significantly different from the mean IQs and standard deviations of 
the standardization groups. In the case of the Full Scale IQ, differences in
individual Ss ranged from -11 to +13 points with a mean of +2.4 points
(WAIS - WISC). In general, the results indicate that the transition from the
WISC to the WAIS at age 16 introduces no significant errors in IQ determination.


In many circumstances, particularly in schools 
and institutions, students or patients are tested 
one or more times before the age of 16 with the 
Wechsler Intelligence Scale for Children (WISC) 
and, subsequently, one or more times with the 
Wechsler Adult Intelligence Scale (WAIS). Since
the WISC is applicable up to age 15 years,
11 months and the WAIS from 16 years, 0
months on up, the question naturally arises as to
whether the two scales “meet” at chronological
age 16 years, 0 months. This would, of course, be
expected since presumably both tests are standardized
against 16 year olds, the WISC at its
upper extremity and the WAIS at its lower limit.


PROCEDURE 


In order to test the comparability of the IQs
derived from the two scales at chronological age 16,
a group of 15 males and 15 females was selected and
tested first with the WISC and 4 weeks later with
the WAIS. At some time during the 4-week interval
each individual passed his 16th birthday.


RESULTS 


Full Scale IQs on the WISC varied from 64 
to 146 with a mean of 99.6 and a standard deviation
of 18.8. The Full Scale IQs on the WAIS
varied from 71 to 138 with a mean of 102.0 and 
a standard deviation of 15.7. These figures indi- 
cate that the group was reasonably comparable to 
the standardization groups which had average IQs 
of 100 and standard deviations of 15. The corre- 
lation between these two sets of IQs was .963 
and the mean difference between them (WAIS — 
WISC) was 2.4 with a range from —11 points to 
+13 points. 

The IQs determined by the two tests are compared
in Table 1. The mean difference between
the Verbal IQs was 3.5 with a range from -13 to
+21 (WAIS - WISC), while the difference between
the means of the Performance IQs was
+1.0 with a range from -13 to +15 points.
None of the differences between mean IQs on the
Verbal, Performance, or Full Scales was significant.


TABLE 1

STANDARD SCORES, RELIABILITIES (TEST-RETEST), AND IQs ON THE WISC AND WAIS
FOR 30 SIXTEEN-YEAR-OLD SUBJECTS

                           WISC            WAIS               WISCᵃ   WAISᵇ
Scale                    M      SD       M      SD      r     alone   alone   M diff.

Intelligence quotients
  Verbal                96.8   18.9    100.3   16.3    .95     -       -      +3.5
  Performance          102.7   17.8    103.7   14.4    .92     -       -      +1.0
  Full                  99.6   18.8    102.0   15.7    .96     -       -      +2.4

Standard scores
  Verbal                47.5   15.0     55.1   16.7    .95    .96ᶜ    .96
  Performance           51.9   12.7     51.6   10.9    .92    .90ᵈ    .93
  Full                  99.4   26.0    106.7   26.3    .96    .94ᵉ    .97
  Information            9.8    3.8      9.3    3.0    .84    .82     .91
  Comprehension          9.2    2.9      9.6    3.3    .58    .71     .79
  Arithmetic             9.7    3.3      9.3    3.1    .96?   .17     .79
  Similarities           9.7    4.0      9.2    3.2    .90    .79     .87
  Vocabulary             9.4    4.0      8.4    3.1    .92    .90     .71
  Digit Span             9.0    3.4      9.2    3.1    .17    .50     .94
  Picture Completion    11.2    3.1     11.0    2.0    .64    .68     .82
  Picture Arrangement   10.1    2.5     10.8    3.0    .70    .72     .66
  Block Design          10.2    3.2     10.3    3.2    .90    .88     .86
  Object Assembly        9.9    2.3     10.3    2.?    .64    .71     .65
  Coding                10.4    4.1      9.6    2.5    .89     -      .92

ᵃ Age 13½, N = 200.
ᵇ Age 18-19, N = 200.
ᶜ Without Digit Span.
ᵈ Without [illegible].
ᵉ Without Digit Span or Coding.
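
The mean differences quoted for the three IQ scales follow directly from the WISC and WAIS means in Table 1, and the Full Scale standard score is the sum of the Verbal and Performance standard scores. The short sketch below reproduces that arithmetic; the dictionary layout and variable names are illustrative only.

```python
# Mean IQs from Table 1 as (WISC, WAIS) pairs.
iq_means = {
    "Verbal": (96.8, 100.3),
    "Performance": (102.7, 103.7),
    "Full": (99.6, 102.0),
}

# Mean difference is taken as WAIS minus WISC, as in the text.
diffs = {scale: round(wais - wisc, 1) for scale, (wisc, wais) in iq_means.items()}

# Full Scale standard score means as Verbal plus Performance.
wisc_full_standard = round(47.5 + 51.9, 1)
wais_full_standard = round(55.1 + 51.6, 1)

for scale, d in diffs.items():
    print(f"{scale}: WAIS - WISC = {d:+.1f}")
```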

The remainder of Table 1 shows the standard 
scores for the individual subtests on both the
Verbal and the Performance scales and the cor- 
relations between them. It is interesting to note 
that the highest correlations among the subtests 
on the Verbal scale are for Vocabulary and Simi- 
larities (.92 and .90, respectively) and on the 
Performance scale for Block Design and Coding 
(.90 and .89, respectively). 

Table 1 indicates that “reliabilities” arrived at 
from a WISC versus WAIS correlation are, in 
general, comparable with the subscale reliabili- 
ties as given in the WISC and WAIS manuals 
(Wechsler, 1949, 1955). 


DISCUSSION


It would appear that changing from the WISC
to the WAIS at chronological age 16 gives a
highly comparable IQ, particularly for the Full
Scale, and that comparisons of Verbal and Performance
IQs are highly satisfactory. These high
correlations must not, however, obscure the fact
that differences as great as 13 points in a Full
Scale IQ were encountered, although the appearance
of such large differences would probably be
a relatively rare event (standard error of measurement
is 3.07).

Since, in the practical testing situation, the 
WAIS is always administered after the WISC, no 
attempt was made to control for practice effect. 
Indeed, since the two tests were separated by 
only 4 weeks in this study, practice effects may 
be assumed to be near maximum. For the Full 
Scale IQ, however, the difference of 2.4 IQ points 
is neither statistically nor behaviorally significant. 
A test battery assessing specific and general treatment effects was readministered
to Ss previously assessed before treatment, after treatment, and at a 6-wk.
follow-up from groups undergoing individual programs of (a) modified systematic
desensitization, (b) insight-oriented psychotherapy, (c) attention-placebo
treatment, and (d) no treatment. Higher return rates were obtained
than in any previous long-term follow-up, revealing maintenance of improvement
found earlier for interpersonal performance anxiety. Systematic
desensitization resulted in the greatest significant improvement (85%), followed
by insight-oriented psychotherapy and attention placebo (50% each),
and untreated controls (22%). Changes were reliable, predictable, and showed
evidence of further generalization. No evidence of relapse or symptom substitution
was obtained, although they were specifically sought. Methodological
problems of follow-up studies are also discussed.


After a review of the difficulties of follow- 
up studies on psychotherapy, Sargent (1960) 
concluded that, “the importance of follow-up 
is equalled only by the magnitude of the 
methodological problems it presents.” In the 
absence of a carefully designed outcome 
study on which to base follow-up investiga- 
tions, the follow-up may be doomed from the 
start. Thus, in many studies, the methods of 
assessment at follow-up differ from those at 
pretreatment and posttreatment (e.g., Berle, 
Pinsky, Wolf, & Wolff, 1953; Cowen & 
Combs, 1950; Sinett, Stimput, & Straight, 
1965). Other studies, especially of a retrospective
nature, have used assessment procedures
of questionable reliability and validity
(e.g., Cooper, Gelder, & Marks, 1965;
Sager, Riess, & Gundlach, 1964; Schmidt, 
Castell, & Brown, 1965). Still others have 
neglected to include appropriate no-treatment
control groups for assessing change in the 
absence of treatment (e.g., Bookbinder, 1962; 
Fiske & Goodman, 1965; Rogers & Dymond, 
1954). The follow-up also suffers, inherently, 
from the uncontrolled nature of client ex- 
periences during the posttreatment period. 
This is especially important when the time 
between treatment termination and follow- 
up is considerably longer than the duration of 
treatment; environmental experiences during 
the posttreatment period may have more in- 
fluence on Ss’ status at follow-up than a 
brief program of treatment some months or 
years in the past. The greatest confounding 
comes from the fact that many Ss receive ad- 
ditional treatment of unknown nature during 
the posttreatment period, thus invalidating 
the design for determining cause-effect rela- 
tionships for the specific treatment under 
investigation. This practical problem has 
limited the value of many follow-up studies 
(e.g., Braceland, 1966; McNair, Lorr, Young, 
Roth, & Boyd, 1964; Stone, Frank, Nash, & 
Imber, 1961). 

Overshadowing all other problems of follow-up
research is the practical difficulty of
sample maintenance and attrition. Even
adequately designed studies may not be able 
to obtain consistent follow-up data on treated 
Ss, let alone controls (e.g., Fairweather & 
Simon, 1963; Kogan, Hunt, & Bartelme, 
1953; Lang & Lazovik, 1963). The problem 
of differential dropout and selective biasing of
the sample cannot be ignored, since differ- 
ences have been found between follow-up 
returnees and nonreturnees (Fiske & Good- 
man, 1965), and further, as May, Tuma, and 
Kraude (1965) point out, even if differences 
are not found, nonreturnees are clearly dif- 
ferent in cooperation, mobility, or both. To 
highlight the magnitude of this problem, a 
thorough search of the literature failed to 
reveal a single study on individual treatment 
of noninstitutionalized adults which obtained 
data on all treated Ss 2 years or more after 
treatment termination, nor one which in- 
cluded an attempt to obtain such data on an 
appropriate group of control Ss. 

The present study is a 2-year follow-up of 
an earlier investigation which was presented 
as a model design for the controlled evalua- 
tion of comparative therapeutic outcome 
(Paul, 1966). In the earlier study, a modified 
form of Wolpe’s (1961) systematic desensi- 
tization was found to be significantly more 
effective in reducing maladaptive anxiety 
than insight-oriented psychotherapy or an 
attention-placebo treatment. Additionally, all 
three treated groups were found to show sig- 
nificant improvement over untreated controls. 
Although these effects were found at termination
of treatment, under stress-condition assessment,
and were maintained at a 6-week
follow-up, the differing theoretical models 
from which the treatment techniques are 
derived make a long-term follow-up even 
more desirable than is usually the case. 

Specifically, the disease-analogy model 
underlying the insight-oriented approach to 
psychotherapy would interpret the results ob- 
tained by systematic desensitization and at- 
tention-placebo treatments as suggestion or 
positive transference—in either case, results 
which would be regarded as merely sympto- 
matic and temporary (e.g., Hendrick, 1958). 
According to this model, not only would Ss 
treated by either systematic desensitization 
or attention placebo be expected to show 
“relapse” after the “supporting contact with 
the therapist fades [Sargent, 1960],” but pos- 
sibly harmful results would also be expected 
because of the necessary occurrence of symp- 
tom substitution (see Ullmann & Krasner, 
1965). In fact, the minimal symptom-substitution
effect expected would be an increase
in anxiety, introversion, rigidity, or depend- 
ency (Fenichel, 1945). Additionally, some 
unsuccessful cases treated by insight-oriented 
psychotherapy might be expected to realize 
benefits at some time after treatment termi- 
nation when their “insights” have had time 
to “consolidate” (Sargent, 1960). On the
other hand, the learning model underlying 
systematic desensitization would predict no 
greater relapse for one group than another 
after treatment termination, since relapse 
would be expected to occur only on the oc- 
casion of unusual stress or if conditions favor- 
ing the relearning of anxiety were en- 
countered. Further, this model would expect 
to find no change in behaviors that were not 
the specific focus of treatment, except through 
generalization or an increase in behavior 
previously inhibited by target behaviors. 
Thus, from the learning framework, if any 
change in anxiety, introversion, rigidity, or 
dependency were to occur at all after treat- 
ment termination, it would be in the opposite 
direction of that expected from the symptom- 
substitution hypothesis (Paul, 1966). Al- 
though the findings at 6-week follow-up 
strongly favored the interpretation of the 
learning model, with none of the results ex- 
pected on the basis of the disease model 
forthcoming, it is possible that the first 
follow-up period was too short to allow the 
expected processes to show their effects. 

In the present study an attempt has been 
made to overcome the methodological and 
practical difficulties of follow-up research 
more adequately than previous attempts. By 
starting with a well-controlled outcome study, 
the same measures of assessment could be 
obtained from Ss at a consistent interval for 
long-term follow-up as were previously ob- 
tained at pretreatment and short-term 
follow-up. Persistent effort resulted in a
greater return of data than has been reported
before, not only for treated Ss, but for
untreated controls as well. Additionally,
specific frequency data were obtained to
allow both the exclusion of Ss receiving additional
treatment and the assessment of
life stresses and possible symptom substitution
during the posttreatment period. The
major purposes of the present study were (a)
to determine the overall comparative effects
of the different treatments from pretreatment
to 2-year follow-up and (b) to examine the
relative stability of improvement from the 
6-week follow-up to the 2-year follow-up, 
particularly with regard to the questions of 
differential relapse and symptom substitu- 
tion versus generalization, as predicted from 
the conflicting theories on which the treat- 
ments were based. 


METHOD 
Subjects 


The Ss included in the present investigation con- 
sisted of three groups of 15 Ss each (10 males, 
5 females) who received individual systematic 
desensitization, insight-oriented psychotherapy, or 
attention-placebo treatment and 44 Ss (32 males, 
12 females) who composed an untreated control 
group. This included all Ss from the previous out- 
come study (Paul, 1966), except for a group of 
untreated controls who participated in a different 
therapy program in another context (Paul & Shan- 
non, 1966). At pretreatment assessment all Ss were 
undergraduates (Mdn = sophomore) enrolled in a 
required public speaking course at the University of 
Illinois, ranging in age from 17 to 24 years 
(Mdn = 19). Each S was selected on the basis of
indicated motivation for treatment, high scores on
performance anxiety scales, and low falsification
from a population of 380 students who requested
treatment for interpersonal performance anxiety, as
described in detail in the earlier report (Paul, 1966).
Although the public speaking situation was reported
to be the most stressful condition imaginable,
anxiety was also reported in almost any social,
interpersonal, or evaluative situation. As a group,
the Ss also differed from the normal student
population by obtaining higher general anxiety and
emotionality scores and lower extroversion scores.
The Ss’ degree of anxiety in performance situations
was strong to severe, and was reported to be of
2-20 years duration.


Procedure 


Pretreatment assessment consisted of the administration
of a battery of personality and anxiety
scales to the students enrolled in the speech course
the week following their first classroom speech.
The battery was constructed specifically to assess
focal treatment effects and to show symptom substitution
or generalization if such processes were
operating. The battery thus included forms of (a)
IPAT Anxiety Scale (Cattell, 1957); (b) Pittsburgh
Social Extroversion-Introversion and Emotionality
Scales (Bendig, 1962); (c) Interpersonal Anxiety
Scales (speech before a large group, competitive contest,
job interview, final course examination) of
the S-R Inventory of Anxiousness (S-R; Endler,
Hunt, & Rosenstein, 1962); (d) a scale of specific
anxiety in a referenced speech performance (PRCS;
Paul, 1966).2 Following initial selection and
prior to treatment assignment, Ss underwent stress- 
condition assessment in which they were required 
to give a 4-minute speech before an unfamiliar 
audience which included four psychologists record- 
ing the presence or absence of 20 observable 
manifestations of anxiety during each 30-second 
period on a timed behavioral checklist. In addition, 
the palmar sweat index and pulse rate were obtained 
immediately before the stress speech, as was the 
Anxiety Differential (see Footnote 3). All Ss under- 
went stress evaluation except for an equated sub- 
group of controls initially used to evaluate the ef- 
fects of the stress-condition assessment itself. 

Following stress-condition evaluation the groups
were formed, equating all groups on observable
anxiety, with Ss randomly assigned to therapists.
After a short screening interview, during which
standard expectations were established, the treatments
began, 4 weeks after pretreatment assessment.
Five experienced psychotherapists (of Rogerian
and Neofreudian persuasion) worked individually
with three Ss (two males, one female) in each
of the three treatment groups for five sessions
over a 6-week period. All three treatments were
conducted concurrently, with missed sessions rescheduled
during the same week. Within the week
following treatment termination, a posttreatment
stress-condition assessment was obtained on treated
Ss and no-treatment controls, including the same
measures used in the pretreatment stress condition.
The first follow-up (FU1) data were then obtained
by a second administration of the test battery to
all Ss 6 weeks after treatment termination. Attitudinal
and improvement ratings were also obtained
from treated Ss and therapists. The details of
all aspects of procedure and results through FU1
are reported in the earlier study (Paul, 1966).

The 2-year follow-up (FU2) procedure required
tracking down the Ss for a third administration of
the test battery which had been administered at
pretreatment and FU1. For FU2 the test battery was
augmented to obtain specific frequency data regarding
the occurrence of stress during the posttreatment
period; the frequency of external behaviors
which might reflect predicted symptom-substitution
effects of increased dependency, anxiety,
or introversion; and information concerning additional
psychological treatment or use of drugs which
might affect S’s behavior or response to the anxiety
scales.

Information on external stress was obtained by
requesting Ss to indicate the number of times each of
a number of events occurred since the last contact
(FU1). These events covered five major areas of
stress: (a) illness or death of loved ones; (b) conflict
(with fiancé or spouse, with persons in authority);
(c) change in family structure (engagement, marriage,
separation, divorce, pregnancy, or birth);
(d) personal illness or accident; (e) change in work
or living arrangements (move to a different residence,
move to a different city, take a new job, change
vocational goals, leave college).


3 The original battery also included a form of the
Anxiety Differential (Husek & Alexander, 1963).
This form was excluded from follow-up analysis
since an additional stress administration was not
obtained.

Behavioral frequencies regarding possible symptom 
substitution consisted of the following 13 items: 


1. In the past two weeks, how many times did
you seek advice, guidance, or counsel from:
friends? ___; spouse/fiancé? ___; instructor/supervisor? ___;
parents? ___; physician? ___; others
(please specify)? ___

2. In the past two weeks, how many times was
advice, guidance, or counsel offered which you
did not seek from: (same as #1).

3. In the past two weeks, how many times did
you accept advice, guidance, or counsel when it
was provided from: (same as #1).

4. Of your close friends and relatives, with
how many different people would you currently
feel that you could discuss personal problems
should the need arise? ___

5. To how many clubs or organizations do you
currently belong? ___

6. How many dances, parties, or similar social
events have you attended in the past month? ___

7. In the past month, how many events have
you attended as a “spectator” (such as concerts,
meetings, sporting events, etc.)? ___

8. How many times in the past month have
other persons been to your home (or room) to
visit you? ___

9. In the past month, how many times have you
visited or “gone-out” with another person? ___

10. Of the different people you have visited,
gone-out with socially, or who have visited you
in the past month, how many were: males? ___;
females? ___

11. How many times have you participated in
group discussion in the past month? ___

12. In the past three months, how many
times have you spoken or appeared before a
group? ___

13. How many different groups have you appeared
before in the past 3 months? ___


Additional information was requested regarding
the date and audience size of public appearances in
order to appropriately analyze the PRCS and S-R
speech scales. The same self-ratings of specific and
general improvement which were obtained from
treated Ss at FU1 were also included at FU2.

The procedure for FU2 contact ran as follows:
24 months from the date of treatment termination
a packet containing the test battery, behavioral
questionnaires, and rating scales was mailed to the
last known address of each S. The packet was accompanied
by a cover letter explaining the importance
of participation for one last time and was
otherwise designed to enlist cooperation, including
an offer to furnish the results of the investigation.
This letter set a date 3 weeks in the future by
which the completed forms were to be returned in
a stamped, self-addressed envelope which was provided.
Those Ss not returning forms by the first
due date were sent a personal letter which further
stated the importance of their specific participation,
and a new due date was set 2 weeks hence. The
Ss not responding to the second letter were then
sent a complete new packet by registered mail, as
were those Ss for whom new addresses were necessary.
Those Ss not responding to the third letter
were personally contacted by telephone and reminded
of the importance of returning the data, and
a promise was elicited to do so immediately. An
arbitrary cut-off date was set exactly 27 months
after treatment termination, for determining “nonreturnee”
status of contacted Ss. Thus, although
FU2 was designated as a 2-year follow-up, the actual
time from termination was 25-27 months, closer to
2 years from FU1 than from treatment termination.


RESULTS 
Return Rate 


Of first concern was the adequacy of the 
follow-up procedure for locating Ss and elicit- 
ing their cooperation. Even though the sample 
was highly mobile (64% no longer in the 
local area, and 27% out of state or out of the 
country) all treated Ss and all but three con- 
trol Ss were located. Complete data were 
returned by 100% of the treated Ss (N = 
45), and 70% of the controls (N = 31). 
Of the 13 nonreturning controls (10 males, 3 
females), 1 was deceased, 1 was in a mental 
hospital, 1 flatly refused, 7 failed to return 
after multiple contact, and 3 could not be 
located. Thus, the return rate was 79% for 
contacted controls who could return data, still 
significantly lower than the return rate for 
treated Ss (p< .001, Fisher exact prob- 
abilities test). 
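
The return-rate figures above can be retraced with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the controls "who could return data" are the 44 controls minus the deceased S, the hospitalized S, and the 3 who could not be located, and it runs a one-sided Fisher exact test on that 2 x 2 table. The text does not state which counts entered the test or whether it was one- or two-tailed, so the computed p need not match the reported value exactly.

```python
from math import comb

def fisher_one_sided(returned_a, total_a, returned_b, total_b):
    """One-sided Fisher exact p: chance that group b shows returned_b or
    fewer returns when all row and column totals are held fixed."""
    n = total_a + total_b
    k = returned_a + returned_b              # total forms returned
    denom = comb(n, k)
    lowest = max(0, k - total_a)             # fewest returns group b can have
    return sum(
        comb(total_b, x) * comb(total_a, k - x) / denom
        for x in range(lowest, returned_b + 1)
    )

controls_total = 31 + 13                     # returning + nonreturning controls
cannot_return = 1 + 1 + 3                    # assumption: deceased, hospitalized, not located
controls_able = controls_total - cannot_return

overall_rate = 31 / controls_total           # about 70%
adjusted_rate = 31 / controls_able           # about 79%
p_value = fisher_one_sided(45, 45, 31, controls_able)

print(f"{overall_rate:.0%} overall, {adjusted_rate:.0%} adjusted, p = {p_value:.4f}")
```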

Since the purpose of the long-term fol- 
low-up was to determine the effects of the 
specific treatments included in the previous 
outcome study, Ss who received three or more 
sessions of psychological treatment during the 
posttreatment period were excluded from 
further analyses. On this basis, 3 Ss were
excluded from the insight-oriented group,
1 each from systematic-desensitization
and attention-placebo groups, and 12 returning
controls; the difference between the proportion
of treated Ss and controls receiving
treatment during the follow-up period being
highly significant (χ² = 9.87, df = 1, p <
.01). Additionally, one desensitization S was
excluded because she was undergoing chemo- 
therapy for a thyroid deficiency at FU2, and 
one control was excluded on the basis of an 
extreme falsification score. While argument 
could be made either for including Ss who 
received additional treatment or for counting 
all such Ss as relapses, the data available on 
such additional treatment is unclear. It ap- 
pears that most of the treated controls, two 
of the treated insight Ss, and the attention- 
placebo S did seek treatment for anxiety- 
related difficulties, while the desensitization S 
and one insight S sought primarily vocational 
counseling. 

Although data obtained at pretreatment 
and FU1 revealed no significant differences
between the treated Ss who obtained addi- 
tional treatment and those who did not, there 
is no question that the retained controls con- 
stituted a biased subsample of the original 
control group. The nonreturning controls 
were found to differ from the retained con- 
trols in showing significantly greater increases 
from pretreatment to FU1 (Pre-FU1) on the
general and examination anxiety scales, and 
a higher rate of academic failure over the 
follow-up period (78% versus 32%). Those 
controls excluded because they received treat- 
ment during the follow-up period also differed 
from retained controls by showing a greater 
Pre-FU1 decrease in general anxiety, lower
extroversion scores, and significantly greater 
increases on all specific anxiety scales. Even 
though there were no differences in demo- 
graphic variables between retained controls 
and those lost or excluded, the retained con- 
trols appear to have improved more from 
pretreatment to FU1, therefore raising the
possibility that differences between treatment
groups and controls at FU2 may underesti- 
mate treatment effects. Likewise, if Ss ex- 
cluded on the basis of additional treatment 
really were cases of relapse, the differential
exclusion of these Ss would operate most in 
favor of the control group and, secondly, in 
favor of the insight-oriented group, while 
biasing results against systematic desensitization
and attention-placebo treatments.

Comparative Treatment Effects from Pretreatment
to FU2 (Pre-FU2)


The overall evaluation of treatment effectiveness
is most reasonably made by a comparison
of Pre-FU2 changes between groups,
since Pre-FU1 changes had been subjected to
detailed analysis earlier. Two scales of the
battery (PRCS and S-R speech) focus specifically
on performance anxiety in the speech
situation, the specific treatment target. Unlike
pretreatment and FU1 assessments, however,
there was no common reference speech for
PRCS, and the size of audiences to which
Ss had been exposed varied so widely that
the separate consideration of S-R speech was
no longer meaningful. Therefore, these two
scales were converted to T scores and combined
to form a Speech Composite score
before analyses were undertaken. While the
Speech Composite provides evaluation of
specific treatment effects, the additional S-R
scales report on performance anxiety in three
different interpersonal-evaluative situations,
none of which were the specific focus of treatment.
These latter scales, along with the
general scales on Social Extroversion, Emotionality,
and General Anxiety, provide information
on generalization or, conversely,
symptom substitution. Before carrying out
the main analyses on the data, the possibility
of systematic differences attributable to the
five participating therapists was investigated.
As was previously found on pretreatment and
posttreatment stress-condition data and Pre-FU1
analyses, in no instance for any measure
were significant or suggestive Pre-FU2 differences
found among the overall (main) effects
achieved by the five therapists or among the
effects achieved by different therapists with
the three different treatment procedures
(interactions). Consequently, the Ss within
treatment groups have been pooled in the
following analyses.
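
The construction of the Speech Composite described above (two scales converted to T scores and summed) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the raw scores are hypothetical, and the T scores are standardized within the sample at hand, whereas the study may have used a separate reference population.

```python
from statistics import mean, pstdev

def to_t_scores(raw_scores):
    """Rescale raw scores to T scores (mean 50, SD 10) within this sample."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [50 + 10 * (x - m) / sd for x in raw_scores]

def speech_composite(prcs_raw, sr_speech_raw):
    """Sum the PRCS and S-R Speech T scores subject by subject."""
    prcs_t = to_t_scores(prcs_raw)
    sr_t = to_t_scores(sr_speech_raw)
    return [a + b for a, b in zip(prcs_t, sr_t)]

# Hypothetical raw scores for five Ss (not data from the study).
prcs = [25, 22, 18, 27, 20]
sr_speech = [48, 41, 39, 52, 44]
composite = speech_composite(prcs, sr_speech)
```

Because each T score is centered on 50, a composite built this way is centered on 100 in the standardizing sample, and neither scale dominates the sum merely because of its raw-score range.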

The Speech Composite and each of the
additional scales from the test battery were
subjected to three-way analyses of variance
(Treatments, Pre-FU2, Subjects) on the
scores of Ss retained at FU2. Means and
standard deviations for all assessment periods
are presented in Table 1 for specific anxiety
scales and in Table 2 for general scales.

TABLE 1
MEAN SCORES ON SPECIFIC ANXIETY SCALES AT PRETREATMENT, 6-WEEK FOLLOW-UP (FU1),
AND 2-YEAR FOLLOW-UP (FU2) FOR SUBJECTS RETAINED AT FU2

                                    Speech Composite    S-R Interview    S-R Examination    S-R Contest
Treatment           Testing            M       SD         M      SD         M      SD         M      SD
Desensitization     Pretreatment     115.5    9.74       43.2   11.01      46.8   10.32      35.6    7.92
(N = 13)            FU1               85.0   16.10       37.4    8.82      43.2   10.81      35.5    7.28
                    FU2               82.5   16.07       31.5    8.79      36.5    9.28      30.5    6.68
Insight             Pretreatment     117.7    7.15       37.6    9.67      42.5   10.79      40.8    8.13
(N = 12)            FU1              103.4   14.18       35.6   11.94      42.2   12.01      39.1   10.24
                    FU2               95.2   18.70       31.3    9.42      39.0    8.99      36.3   10.77
Attention-Placebo   Pretreatment     110.7   11.98       34.8    7.34      40.6    9.79      36.9    9.69
(N = 14)            FU1               86.4   12.47       32.1    7.22      35.9   12.23      34.0    9.75
                    FU2               82.9   20.85       28.7    8.03      32.1    7.74      28.9   10.40
Control             Pretreatment     110.9   12.20       37.2   12.98      40.7   10.62      33.9   11.51
(N = 18)            FU1              104.3   14.21       34.7   10.16      41.9   11.29      36.3    8.19
                    FU2               99.2   21.66       32.2   10.98      38.4   11.07      33.2    8.11


These analyses indicate highly significant
Pre-FU2 changes (p < .01; df = 1/53), not
only for the Speech Composite (F = 82.70),
but for all other specific anxiety scales (F =
35.94, 26.93, 10.39 for S-R Interview, Examination,
and Contest, respectively) and
general scales (F = 12.69 and 15.21, respectively,
for Extroversion and IPAT Anxiety
Scale) except Emotionality, which only
approached significance (F = 3.05, p < .10).
More important, significant Treatment ×
Pre-FU2 interactions (df = 3/53) were obtained
for the Speech Composite (F = 3.68,
p < .05) and for S-R Interview (F = 5.14,
p < .01), S-R Examination (F = 6.96, p <
.01), and IPAT Anxiety Scale (F = 3.46,
p < .05), indicating differential changes
among groups from pretreatment to the
2-year follow-up. The nature of these changes
may be seen in Figure 1, which presents the
mean change for each group from pretreatment
to FU1 and FU2 for all scales of the
test battery. Unlike Pre-FU1 changes, where
significant overall effects were found only
for speech anxiety and extroversion, the significant
Pre-FU2 main effects reported above
reflect general trends in the improved direction
for all scales at FU2.


Of the significant Pre-FU2 interactions, of
most interest is the Speech Composite, which
reflects change in the focal area of treatment.
Inspection of Figure 1 reveals that all four
groups maintained their relative positions
from FU1 to FU2, with slight additional
shifts in the direction of lower mean anxiety
scores for all groups. As was the case with
Pre-FU1 comparisons, all three treatment
groups were found to show significant improvement
over controls (t = 3.70, 2.04, and
2.38 for desensitization, insight, and attention-placebo
groups, respectively; p < .05),
with no significant difference between the
mean anxiety reduction achieved by the attention-placebo
group and the insight group
(t < 1). Also, like Pre-FU1 comparisons, Ss
treated by systematic desensitization showed
significantly greater mean Pre-FU2 reductions
in anxiety on the Speech Composite
than Ss who were treated by insight-oriented
psychotherapy (t = 2.09, p < .05). However,
even though the magnitude of the difference
between mean anxiety-reduction scores of the
desensitization group and the attention-placebo
group for Pre-FU2 comparisons was
the same as that of Pre-FU1 comparisons,
these differences were no longer found to be
significant at FU2 (t < 1). This was the
result of greater variability in the Pre-FU2
change scores of the attention-placebo group,
primarily due to a drop of 71 points for one
attention-placebo S. The overall effects between
these two groups may be seen better
in the individual data presented below.
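
The group mean reductions discussed above can be read directly off the Table 1 means, since the same Ss contribute to every assessment period. The sketch below assumes that the first pair of columns in Table 1 is the Speech Composite; the dictionary layout and function name are illustrative.

```python
# Speech Composite means from Table 1 (pretreatment, FU1, FU2).
speech_composite_means = {
    "desensitization":   {"pre": 115.5, "fu1": 85.0,  "fu2": 82.5},
    "insight":           {"pre": 117.7, "fu1": 103.4, "fu2": 95.2},
    "attention_placebo": {"pre": 110.7, "fu1": 86.4,  "fu2": 82.9},
    "control":           {"pre": 110.9, "fu1": 104.3, "fu2": 99.2},
}

def mean_reduction(group, period):
    """Mean drop in Speech Composite score from pretreatment to the period."""
    means = speech_composite_means[group]
    return round(means["pre"] - means[period], 1)

for group in speech_composite_means:
    print(group, mean_reduction(group, "fu1"), mean_reduction(group, "fu2"))
```

On these means the ordering of the four groups is the same at both follow-ups, and every group shows a somewhat larger reduction at FU2 than at FU1, which is the pattern the text describes.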

Having found essentially the same results
to obtain for focal treatment effects at the
2-year follow-up as at the 6-week follow-up,
the significant interactions between groups
and Pre-FU2 change scores on the other
scales of the test battery become of interest.
Of the additional specific anxiety scales and
general scales, a significant interaction effect
was found only for IPAT Anxiety in the
earlier analysis of Pre-FU1 data. The source
of that interaction was found in significantly
greater anxiety reduction for desensitization
and attention-placebo groups than for controls.
A significant overall increase in extroversion
was also found on Pre-FU1 analysis,
but no significant interaction was obtained
over that time period. As indicated
above, significant Pre-FU2 interactions were
again found for IPAT Anxiety and, in addition,
for S-R Interview and Examination
anxiety scales. Inspection of the nature of
these changes (Figure 1) showed continued
improvement over the follow-up period for
the desensitization group on the S-R Interview
scale, such that the Pre-FU2 reduction
for the desensitization group was significantly
greater than that for controls
(t = 1.75, p < .05) and approached significance
when compared with insight and attention-placebo
groups (respectively, t = 1.39,
1.61; p < .10). The source of the significant
Pre-FU2 interaction for S-R Examination
was found in significantly greater reductions
for both desensitization and attention-placebo
Ss over controls (t = 2.44, 1.75; p < .05)
and for desensitization over insight (t = 1.72,
p < .05). Figure 1 shows that the significant
interaction obtained on IPAT Anxiety at FU2
is a result of the combined FU1 reduction
obtained by the desensitization and attention-placebo
groups as compared to insight and
control groups, although the latter two groups
improved sufficiently from FU1 to FU2 that
individual between-group comparisons alone
were no longer significant. By the 2-year
follow-up, the desensitization group had continued
to show increased Social Extroversion


TABLE 2
MEAN SCORES ON GENERAL SCALES AT PRETREATMENT, 6-WEEK FOLLOW-UP (FU1),
AND 2-YEAR FOLLOW-UP (FU2) FOR SUBJECTS RETAINED AT FU2

                                      Extroversion      Emotionality      IPAT Anxiety
Treatment           Testing            M      SD         M      SD         M      SD
Desensitization     Pretreatment      14.1    7.58      19.8    6.03      40.7   10.69
(N = 13)            FU1               17.9    8.45      18.9    6.16      38.2   11.18
                    FU2               19.9    6.18      17.5    7.08      32.0   10.01
Insight             Pretreatment      16.4    6.57      17.2    5.59      33.7   10.09
(N = 12)            FU1               18.9    4.70      18.3    6.12      35.0   11.72
                    FU2               18.9    4.64      15.6    6.56      30.5   12.29
Attention-Placebo   Pretreatment      14.1    8.15      18.1    6.02      35.4    9.77
(N = 14)            FU1               17.1    7.68      16.8    7.01      30.7   11.74
                    FU2               16.1    7.01      17.1    6.75      28.2   12.12
Control             Pretreatment      17.9    5.53      17.9    5.92      37.7   16.91
(N = 18)            FU1               20.2    6.30      18.4    6.31      37.7   11.48
                    FU2               19.4    6.56      17.2    7.97      33.6   14.34


[Figure 1. Panels include Speech Composite, S-R Interview, S-R Contest, IPAT Anxiety,
Emotionality, and Extroversion; horizontal axis: time of assessment (Pre, FU1, FU2);
separate lines for Desensitization (N = 13), Insight (N = 12), Attention-Placebo (N = 14),
and Control (N = 18).]

FIG. 1. Mean change from pretreatment to 6-week follow-up (FU1) and 2-year follow-up
(FU2) for Ss retained at FU2.


scores to the point that the Pre-FU2 increase
in extroversion was significantly greater than
that of the other three groups (t = 2.06,
df = 53, p < .05). No other mean group
comparisons approached significance from
pretreatment to FU2.

Although self-ratings of improvement by
treated Ss had previously failed to discriminate
between groups, direct ratings of perceived
improvement were still included at
FU2 because of widespread usage in other
follow-up studies. As before, in sharp contrast
to the specific measures of anxiety
reduction, no significant differences were
found among groups on mean self-ratings of
improvement. The Ss in all three treatment
groups gave mean ratings ranging from
“somewhat improved” to “much improved”
for both specific reduction of performance
anxiety and improvement in other areas.


Individual S Improvement from Pretreatment
to FU2


Since clinical workers are more often concerned
with percentage improvement in individual
cases than with mean group differences,
and since negative treatment effects or
symptom substitution would be more easily
identified from data on individuals, all test
data were further evaluated on the basis of
individually significant Pre-FU2 change
scores. An individual case was classified as
“significantly improved” on each scale if the
Pre-FU2 reduction in anxiety score or increase
in extroversion score exceeded 1.96
times the standard error of measurement for
the instrument (two-tailed .05 level, as
previously determined from a population of
523, Paul, 1966). Likewise, an individual case
was classified as “significantly worse” on
each scale if a Pre-FU2 increase in anxiety
score or decrease in extroversion score exceeded
1.96 times the standard error of
measurement for the instrument.
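
The individual classification rule just described can be written as a small function. This is a sketch under stated assumptions: the scores and the standard error of measurement (SEM) values below are hypothetical, since the SEM for each instrument is not given in this section, and the function and parameter names are illustrative.

```python
def classify_change(pre, fu2, sem, higher_is_better=False, z=1.96):
    """Label one S's Pre-FU2 change on one scale.

    For anxiety scales a drop counts as improvement; for extroversion
    (higher_is_better=True) a rise counts as improvement.
    """
    change = fu2 - pre
    improvement = change if higher_is_better else -change
    threshold = z * sem                      # two-tailed .05 cut-off
    if improvement > threshold:
        return "significantly improved"
    if improvement < -threshold:
        return "significantly worse"
    return "no change"

# Hypothetical cases (scores and SEM values are not from the study).
print(classify_change(pre=115, fu2=85, sem=8.0))                         # anxiety scale
print(classify_change(pre=40, fu2=45, sem=4.0))                          # anxiety scale
print(classify_change(pre=14, fu2=24, sem=3.0, higher_is_better=True))   # extroversion
```

Passing `z=1.65` gives the one-tailed .05 cut-off instead of the two-tailed one.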

Overall Pre-FU2 improvement rates presented
in Table 3 again disclosed significant
differences between groups not only for focal
treatment effects from the Speech Composite,
but for all other comparisons as well. Particularly
striking was the finding that not a
single case retained at FU2 in any group
showed a significant increase in performance
anxiety. Additionally, the percentage improvement
of groups was remarkably consistent
with a similar classification made
earlier on the basis of pre- to posttreatment
change from stress-condition data. By comparing
the percentage of improved Ss in the
attention-placebo group with untreated controls,
it was possible to estimate the percentage
of Ss responding favorably to merely
undergoing treatment, over and above the
base-rate improvement from extratreatment
experiences throughout the 2.5-year period:
28%. Similarly, by comparing the percentage
of Ss improved under attention-placebo with
those improved under insight-oriented psychotherapy
and systematic desensitization, it
was possible to estimate the percentage of
additional Ss receiving lasting benefit from
either the achievement of “insight” or “emotional
re-education,” over and above the nonspecific
effects of undergoing treatment. For
Ss receiving systematic desensitization, these
comparisons revealed an additional lasting
improvement of 35% for focal effects and
11% for generalized effects over that improvement
expected from attention placebo.
Again, no differences were found between the
effects achieved by insight-oriented psychotherapy
and attention-placebo treatment, although
both produced better improvement
rates than untreated controls. The “other
comparisons” in Table 3 also favored a generalization
interpretation of the effects of
desensitization for changes found in areas
which were not the specific focus of treatment,
without the slightest suggestion of
symptom substitution. Symptom substitution
would be reflected in higher percentages in the
“significantly worse” category for both attention-placebo
and desensitization groups.

TABLE 3
PERCENTAGE OF CASES SHOWING SIGNIFICANT CHANGE
FROM PRETREATMENT TO 2-YEAR FOLLOW-UP

                         Significantly      No        Significantly
Treatment                  improved       change         worse

Focal treatment (Speech Composite)a
  Desensitization            85%           15%            --
  Insight                    50%           50%            --
  Attention-Placebo          50%           50%            --
  Control                    22%           78%            --

All other comparisons (six scales)b
  Desensitization            36%           64%            --
  Insight                    25%           71%            4%
  Attention-Placebo          25%           70%            5%
  Control                    18%           74%            8%

Note.—N = 13, 12, 14, and 18, respectively, for desensitization,
insight, attention-placebo, and control. Classifications
derived by two-tailed .05 cut-offs on each individual change
score (see text).
a χ² = 11.64, p < .01.
b χ² = 8.11, p < .05.
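
The increments quoted above follow from simple differences between the "significantly improved" percentages in Table 3. The sketch below retraces that arithmetic; the dictionary layout and function name are illustrative.

```python
# Percentage of Ss classified "significantly improved" (Table 3).
improved = {
    "focal": {"desensitization": 85, "insight": 50,
              "attention_placebo": 50, "control": 22},
    "other": {"desensitization": 36, "insight": 25,
              "attention_placebo": 25, "control": 18},
}

def increment(area, group, baseline):
    """Percentage-point gain of one group over a baseline group."""
    return improved[area][group] - improved[area][baseline]

# Nonspecific effect of merely undergoing treatment.
print(increment("focal", "attention_placebo", "control"))            # 28
# Additional lasting benefit of desensitization over attention placebo.
print(increment("focal", "desensitization", "attention_placebo"))    # 35
print(increment("other", "desensitization", "attention_placebo"))    # 11
# Insight-oriented psychotherapy adds nothing over attention placebo.
print(increment("focal", "insight", "attention_placebo"))            # 0
```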

Comparative Relapse and Symptom Substitution
over the Follow-up Period


While overall Pre-FU2 evaluations gave no
suggestive evidence to support the symptom-substitution
hypothesis, nor any evidence
that more Ss treated by desensitization and
attention-placebo programs became significantly
worse in any area, no information on
relapse can be obtained from Pre-FU2 comparisons.
Rather, cases of relapse must be
identified as those cases showing a significant
increase in anxiety as reflected on the Speech
Composite from FU1 to FU2. Similarly, if a
symptom-substitution process were operating,
a higher percentage of change in the “worse”
direction should be obtained from FU1 to FU2
on nonfocal scales for desensitization and
attention-placebo Ss who maintained improvement
on the Speech Composite. As noted
above, the data presented in Figure 1 show
no evidence of relapse or symptom substitution
for the groups as a whole from FU1
to FU2.

Before concluding that the symptom-substitution
effects and differential relapse predicted
by the disease model had not occurred,
a more sensitive analysis was made of the
individual data from FU1 to FU2. A case was
classified as significantly worse on each scale
if from FU1 to FU2 an increase in anxiety
on the Speech Composite (relapse) or other
anxiety score (symptom substitution) or a
decrease in extroversion score (symptom substitution)
exceeded 1.65 times the standard
error of measurement for the instrument
(one-tailed .05 level cut-offs). The percentage
of Ss maintaining status versus the percentage
“getting worse” from FU1 to FU2 for each
group is presented in Table 4. No significant
differences between groups were found on any
measure. In fact, as the figures for the Speech
Composite demonstrate, there was not a
single case which could be considered a relapse
in any of the retained Ss from the
three treatment groups. Additionally, the percentage
of scores in the significantly worse
direction, which would reveal symptom substitution,
did not differ from the .05 level
for any group. If Ss who received additional
treatment during the follow-up period were to
be included as cases of relapse, the figures
would be even less in favor of the predictions
based on the disease model, with 93% maintaining
status for both desensitization and
attention-placebo groups, as compared to
80% for insight and less than 40% for
controls.

TABLE 4
PERCENTAGE OF CASES SHOWING RELAPSE OR SYMPTOM
SUBSTITUTION FROM 6-WEEK FOLLOW-UP
TO 2-YEAR FOLLOW-UP

                          Maintained       Significantly
Treatment                 FU1 status          worse

Focal treatment (Speech Composite)
  Desensitization           100%               --
  Insight                   100%               --
  Attention-Placebo         100%               --
  Control                    89%              11%a

All other comparisons (six scales)
  Desensitization            97%               3%b
  Insight                    96%               4%b
  Attention-Placebo          94%               6%b
  Control                    93%               7%b

Note.—N = 13, 12, 14, and 18, respectively, for desensitization,
insight, attention-placebo, and controls. Classifications
derived by one-tailed .05 cut-offs on each individual change
score (see text).
a “Relapse.”
b “Symptom substitution.”

The frequency data obtained from the 13- 
item behavioral questionnaire specifically 
constructed to reveal hypothesized symptom- 
substitution effects also failed to provide 
any support for the symptom-substitution 
hypothesis. Kruskal-Wallis one-way analyses 
of variance by ranks over the four groups 
on each item produced an H < 3.66 (p> 
.30) on all items but one. On that item—No. 
9, frequency of social exchange—the value of 
H approached the .10 level of significance 
and was in favor of the desensitization group. 
In fact, a significant coefficient of concord- 
ance (W = .47, p < .01) over all items was 
obtained, with the desensitization Ss receiv- 
ing an equal mean rank with the insight Ss, 
both in the direction opposite to symptom- 
substitution effects. Similarly, Kruskal-Wallis 
analyses over the four groups for frequencies 
of each of the five areas of stress reported 
over the follow-up period failed to reveal 
significant differences between groups (all
H < 3.66, p > .30; except C, “Change in
family structure,” where H = 5.08, p < .20).
Thus while the occurrence of stress might be 
considered as evidence of symptom substitu- 
tion or an external influence on relapse (Stone, 
et al., 1961), these questions need not be of 
concern in the present study, since no differ- 
ences in the reported occurrence of stress 
approached significance between groups. 
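
For readers unfamiliar with the Kruskal-Wallis statistic used above, the following sketch computes H for any number of groups from rank sums. It is an illustration only: the frequencies are hypothetical, tied values receive averaged ranks, and no tie correction is applied, which may differ from the computation used in the study. With four groups H is referred to chi-square with 3 degrees of freedom, for which 3.66 is the p = .30 point quoted in the text.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from rank sums (averaged ranks, no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    weighted = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

# Hypothetical item frequencies for four groups (not data from the study).
desensitization = [3, 5, 4, 6, 2]
insight = [4, 3, 5, 2, 6]
attention_placebo = [2, 4, 3, 5, 3]
control = [3, 2, 4, 1, 5]

h = kruskal_wallis_h([desensitization, insight, attention_placebo, control])
print(f"H = {h:.2f}; below the 3.66 cut-off: {h < 3.66}")
```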


Interrelationships among Variables 


Since the earlier study assessed specific
improvement through several different instruments,
persons, and situations in addition
to the instruments on which FU2 data were
obtained, information relating to both predictive
and construct validity of improvement
may be gained through the correlation
of previous improvement scores with those
obtained at FU2. For systematic agreement
across different instruments, positive correlations
would be expected between all change
scores for each measure of performance anxiety.
FU1 improvement ratings of Ss and
therapists should be positively correlated with
Ss’ ratings at FU2. Further, FU1 ratings of
improvement should be negatively correlated
with Pre-FU2 performance-anxiety change
scores. Opposite relationships would be expected
for therapist posttreatment ratings of
prognosis, since these scales were reversed.

Table 5 presents the correlations of pre-
to posttreatment stress-condition change
scores, therapist posttreatment ratings, and
FU1 ratings of treated Ss with FU2 ratings
of treated Ss and Pre-FU2 change on the
Speech Composite and all other scales of the
test battery. Specific FU2 improvement data
(Ss’ ratings, Pre-FU2 Speech Composite)
were significantly correlated in the expected
direction with all indicants of specific improvement
at posttreatment and FU1, except
for the relationship between the Physiological
Composite and the Speech Composite.
Previous analyses had also failed to find
significant relationships between physiological
and self-report data, although physiological
change was significantly correlated with
observable manifestations of anxiety under
stress conditions as assessed by the behavioral
checklist.

Of the correlations presented in Table 5, 


TABLE 5
CORRELATION OF PRIOR IMPROVEMENT SCORES WITH ALL CHANGE SCORES FROM
PRETREATMENT TO 2-YEAR FOLLOW-UP

                             Subject FU2 rating
                               of improvement                          Pre-FU2 change
                                                   Speech     S-R        S-R      S-R      IPAT     Emotion-  Extro-
Prior improvement data       Specific   Other     Composite  Interview   Exam    Contest  Anxiety    ality    version

Pre- to posttreatment
stress-condition change
  Physiological composite     -.33*     -.31*       .11       .13       .32*     .22      .46**     [?]      [?]
  Behavioral checklist        -.34*     -.13        .61**     .07       .20      [?]      [?]       [?]      [?]
  Anxiety differential        -.33*     -.27*       .44**     .28*      .22      .26*     .46**     [?]     -.34
Standardized therapist
posttreatment rating
  Specific improvement         .30*      .45       -.51**    -.18      -.24     -.31*    -.38**     .04      [?]
  Other improvement            .02       .03        .01       .07       .00      .02      .08      -.10      [?]
  Specific prognosis          -.35*     -.31*       .50**     .13       .30*     [?]      .24       .02      [?]
  Other prognosis             -.19      -.19        .25       .05       .01      [?]      .16      -.09     -.32
Subject FU1 rating
  Specific improvement         .63**     .47**     -.56**    -.04      -.30*    -.19     -.18       .03      .34*
  Other improvement            [?]       [?]       -.24      -.15      -.03     -.19     -.11       .08      .20

Note.—N = 44 for stress condition; N = 39 for ratings.
*p < .05.
**p < .01.




the relationship of the behavioral checklist
with Pre-FU2 assessments is of special importance.
The behavioral checklist was the
most objective measure of all instruments
used and was highly reliable (interrater
reliability = .96). Additionally, checklist data
were obtained in a situation where target
behaviors were most likely to occur, and pre-
to posttreatment checklist change was consistently
related to all other prior indicants
of specific anxiety reduction. The correlation
of .61 between pre- to posttreatment change
on the behavior checklist with Pre-FU2
change on the Speech Composite is strong
evidence for both the construct validity of
focal improvement at FU2 and for the predictive
validity of observable posttreatment
improvement.

Table 5 also reveals discriminative relationships
in the correlations of therapist and
subject ratings with Pre-FU2 improvement.
Therapist ratings of specific improvement
and prognosis were significantly correlated
with Pre-FU2 Speech Composite change and
with FU2 ratings of improvement by treated
Ss. Conversely, therapists’ ratings of general
improvement and prognosis were not significantly
related to specific improvement, although
“other prognosis” was related to Pre-FU2
change in extroversion. Likewise, Ss’
ratings at FU1 were significantly related to
Pre-FU2 change in a discriminative way, although
“method factors” predominate in improvement
ratings of Ss as they had earlier.

The correlation of specific improvement
data from the earlier time periods with Pre-FU2
change on the scales of the test battery
which were not directed towards focal treatment
effects also showed several significant
relationships. Inspection of the prime correlations
among all variables presented in
Table 5 found the source of covariation in
every instance to result primarily from increased
relationships at posttreatment and
FU2, with several of the prime correlations
also reaching the .01 level of significance.
The significant correlations presented in
Table 5 may be interpreted as evidence for
the stability of improvement and generalization
effects, rather than as a result of relationships
existing before treatment began. Further,
when the specific posttreatment and
FU1 improvement variables from Table 5
were correlated with FU1-FU2 change for
test battery scales, several low, but significant,
coefficients were obtained (Mdn |r| =
.31), all of which indicated that those Ss who
showed greatest reduction in performance
anxiety at posttreatment and FU1 also
showed greatest specific and generalized additional
improvement over the period between
FU1 and FU2. Since no significant correlations
were obtained between pretreatment
scores on the three general scales and change
on the specific anxiety scales from Pre-FU1,
FU1-FU2, or Pre-FU2, the slight additional
improvement from FU1 to FU2 may be interpreted
as the continuing effects of changes
taking place during the treatment period,
rather than as a function of pretreatment
personality dimensions.

Further information concerning the stability
of scores for each scale of the test
battery over treatment and follow-up periods
may be seen in the test-retest correlations
from Pre-FU1, Pre-FU2, and FU1-FU2
(Table 6). The greater stability of Speech
Composite scores from FU1 to FU2, as compared
to Pre-FU1 and Pre-FU2 relationships,
again indicated the influence of treatment
effects obtaining after pretreatment assessment,
with Ss holding relative positions in a
reliable manner over the 2 years following
FU1. However, it appears that relatively
greater position changes in Extroversion and
IPAT Anxiety occurred over the follow-up
period than over the treatment period.

TABLE 6
INTERCORRELATIONS OF EACH TEST BATTERY SCALE
OVER THE THREE TESTING PERIODS FOR
SUBJECTS RETAINED AT FU2

                          Stability coefficient(a)
Scale                   Pre-FU1    Pre-FU2    FU1-FU2
Speech Composite          .27        .29        .68
S-R Interview             .57        .53        .63
S-R Examination           .50        .52        .51
S-R Contest               .47        .43        .64
IPAT Anxiety              .76        .44        .63
Emotionality              .80        .64        .72
Extroversion              .82        .59        .71

Note.—N = 57; p = .05, r = .22; p = .01, r = .31.
a Pearson r's.

Intercorrelations of FU2 scores for all
scales of the test battery revealed essentially
the same relationships as those reported
earlier for FU1 scores. Significant intercorrelations
were obtained among all scales (Mdn r
= .51), except Extroversion, which was significantly
related only to the Speech Composite
(r = -.27, p < .05). While the combined
relationships reported above and in the
earlier study support the assumption that
FU2 measures were internally consistent, the
reliability of the Pre-FU2 change for the
primary measure can be directly estimated.
The Pre-FU2 changes for PRCS and S-R
Speech (the scales which were converted to
T scores and summed to obtain the Speech
Composite) correlated .64, from which the
reliability of the Speech Composite change
can be estimated (by Spearman-Brown formula)
at .78.
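
The Spearman-Brown estimate quoted above can be reproduced in a few lines. The sketch treats the two component scales as halves of a composite lengthened by a factor of two; the function name is illustrative.

```python
def spearman_brown(r, k=2):
    """Predicted reliability when a measure is lengthened k-fold."""
    return k * r / (1 + (k - 1) * r)

# Correlation between the Pre-FU2 changes on the two component scales.
r_components = 0.64
composite_reliability = spearman_brown(r_components)
print(round(composite_reliability, 2))   # 0.78
```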

Although no differences between groups
were found for the 13 items of the behavioral
questionnaire, indirect support for the validity
of the items was obtained through correlational
analyses. Moderate but significant
correlations were found among the items,
which clustered in the following way: Nos.
1, 2, and 3 (Mdn r = .53); Nos. 6, 9, and 10
(Mdn r = .43); Nos. 3, 5, 8, 11, 12, 13 (Mdn
r = .35). Only No. 7 was unrelated to other
items. Numerous significant correlations
(Mdn |r| = .32) were found between the
items of the second and third clusters and all
scales of the FU2 test battery, indicating that
Ss obtaining lower anxiety scores and higher
extroversion scores also tended to report
having more close friends, belonging to more
organizations, attending more social events,
entertaining more, “going out” more, and
more frequent group discussions and public
appearances. Similarly, of the five areas of
stress on which frequency data were obtained,
all but one (change in family structure)
were significantly intercorrelated (Mdn r
= .35). With one exception, no significant
correlations were found between reported
stress frequencies and items of the behavioral
questionnaire, nor between either FU1-FU2
or Pre-FU2 change for any scale of the test
battery and stress frequencies. The exception
was a significant relationship between the
reported frequency of occurrence of change in
family structure and FU1-FU2 change in
extroversion (r = -.42, p < .01); that is,
those Ss increasing in extroversion from FU1
to FU2 tended to report less change in family
structure over the same time period.

One last check on the symptom-substitu- 
tion hypothesis was carried out by correlat- 
ing Pre-FU2 change on the Speech Composite 
with all other data. Several significant cor- 
relations were obtained between Pre-FU2 
Speech Composite change and items from the 
behavioral questionnaire, but all were in the
direction opposite to that predicted by the disease
model and favored a generalization interpretation.
Intercorrelations of Pre-FU2 change
scores among all seven scales of the test
battery revealed positive correlations between
change on the Speech Composite and
change on all other anxiety scales (Mdn r =
.34) and a negative correlation with change
in Extroversion (r = -.30). Similar relationships
were found among the other scales, with
positive correlations among all anxiety and 
emotionality change scores and negative cor- 
relations between the latter and change in 
Extroversion. Of the 15 correlations, 10
achieved statistical significance (Mdn |r| =
.29).


Discussion 


In general, the combined findings from in- 
dividual and group data as well as correla- 
tional analyses showed the relative gains in 
focal treatment effects found earlier to be 
maintained over the 2-year follow-up period. 
Some additional relative improvement in re- 
lated areas was found for Ss treated by sys- 
tematic desensitization and, to a lesser extent, 
for those treated by attention placebo. As at
the 6-week follow-up, in no instance
were the long-term effects achieved
through insight-oriented psychotherapy sig- 
nificantly different from the effects achieved 
with attention-placebo treatment, although 
both groups showed significantly greater 
treatment effects than untreated controls. As 
a group, the systematic desensitization Ss 
continued to show greater positive treatment 
effects than any other group, with evidence 
of additional generalization, and no evidence 
even suggestive of symptom substitution. In 
fact, the comparative findings at 2-year follow-up
are so similar to the findings at posttreatment
and 6-week follow-up that the
detailed discussion of results in relation to 
previous research, theoretical hypotheses con- 
cerning factors and effects within treatments, 
and methodological implications for research 
and clinical practice which were presented 
earlier (Paul, 1966, pp. 71-99) require no 
modification and need not be reiterated here. 
The finding that effects of systematic 
desensitization are maintained over the fol- 
low-up period with evidence of additional 
improvement through generalization is con- 
sistent with the results of the only other con- 
trolled follow-up of systematic desensitiza- 
tion therapy (Lang & Lazovik, 1963) and 
with the suggestive findings from follow-up 
reports of accumulated case studies (Lazarus, 
1963; Wolpe, 1961). Although all previous 
long-term follow-up studies have suffered 
considerably from the methodological prob- 
lems described at the beginning of this report, 
the general trend of results for psychological 
treatment of noninstitutionalized adults has 
been for treatment effects to be maintained 
or slightly improved over the follow-up period 
(Stone et al., 1961). Consistent with this 
trend, the present investigation found no 
relapse for any of the retained treated Ss, no 
matter what treatment they had received. 
While these findings were somewhat sur- 
prising for systematic desensitization and in- 
sight-oriented psychotherapy, the stability of 
improvement resulting only from the non- 
specific effects of attention-placebo treatment 
was almost completely unexpected. This was 
especially true since previous studies of 
placebo responsiveness had not only found 
relapse on 3-6-month follow-up (Gliedman, 
Nash, Imber, Stone, & Frank, 1958), but 
further, that Ss who improved most at the 
time of their initial placebo experience were 
more likely to relapse than those who im- 
proved least (Frank, Nash, Stone, & Imber, 
1963). The difference between the latter ef- 
fects of pure placebo (inert medication) and 
lasting effects of the attention-placebo treat- 
ment of the present investigation may lie in 
changes in attitudes and expectancies result- 
ing from the interpersonal relationship with 
the therapist functioning as a “generalized 
reinforcer” (Krasner, 1955). Stone et al.
(1961) point out that the long-term success 
of any form of treatment depends in large 
part on the extent to which changes that are 
accomplished are supported by the client’s 
subsequent life experiences. This fact might 
be extended to suggest that no matter how 
change is brought about, it is likely to be 
maintained in a supportive environment 
which reinforces resulting behavior, and it is 
not likely to be maintained if the resulting 
behavior is not reinforced or if new aversive 
consequences or extreme stress reinstitute 
negative emotional responses. While system- 
atic desensitization produced a more direct 
modification of the emotional reactions as- 
sociated with interpersonal performance situa- 
tions, resulting in significantly higher im- 
provement rates, the emergent behaviors of 
Ss experiencing anxiety reduction from all 
three treatments were likely to be regarded as 
socially appropriate and were likely to be 
rewarded, independently of the manner in 
which change initially came about. 

The usual concern with “spontaneous re- 
mission” rates from other populations need 
not be considered in this investigation, since 
an untreated control group from the same 
population was assessed on the same instru- 
ments as were the treatment groups. Even 
though results were favorably biased towards 
the controls, due to differential loss of Ss, 
superior long-term effects for all treatment 
groups were still obtained. Additionally, the
22% “improved” without treatment at the 
2-year follow-up for a favorably biased un- 
treated subgroup seriously questions the 
“two-thirds spontaneous remission” rate so 
frequently quoted (e.g., Eysenck, 1966). Of 
course, Lesse (1964) notes: 


The concept of anything that is labeled as “spon- 
taneous” must be considered in the light of the fact 
that it is spontaneous only because we do not 
understand the causes for the change or are at the 
present time unable to measure various factors that 
influence it. In all probability, therefore, so-called 
spontaneous remissions are probably not spontane- 
ous at all [p. 111]. 


There is no reason to believe that factors 
other than the same environmental influences 
which maintained improvement for treated Ss 
were involved in the improvement and stability
of untreated controls. In fact, processes
similar to desensitization may take place 
through environmental interaction in the ab- 
sence of formal treatment (Stevenson, 1961), 
and considerable nonspecific therapy may be 
expected without contacting a socially desig- 
nated psychological helper (Goldstein, 1960). 

While this investigation was able to over- 
come methodological difficulties more ade- 
quately than previous attempts, it still 
suffered from difficulties inherent in the 
nature of follow-up studies. The tight control 
procedures maintained during the earlier out- 
come study were not possible once Ss were 
“turned loose” after the 6-week follow-up. 
When control is not possible, attempts at 
assessment are a second-best choice. Although 
Ss were asked to indicate whether or not 
treatment had been received during the fol- 
low-up period, only 5 indicated that they had, 
when a total of 17 were actually identified as 
having received treatment through a survey 
of clinics and therapists. Considering the 
high return rate for this investigation, the 
problem of Ss not reporting additional treat- 
ment in other studies could be astronomical. 
Even though a higher return rate was ob- 
tained than in previous follow-up studies, 
total assessment of cause-effect relationships 
for treatment groups was not possible due to 
the necessity of S exclusion. Additionally, the 
untreated controls were known to be a favor- 
ably biased subgroup which may have under- 
estimated treatment effects and overestimated 
(un)spontaneous remission. Although the as- 
sessment instruments used possessed adequate 
reliability and validity for determining ef- 
fects, the mobility of the sample precluded 
use of the instrument which was known to 
provide the most objective evaluation (i.e., 
the behavioral checklist). 

These inherent difficulties have led some 
investigators to question the value of long- 
term follow-ups. May et al. (1965) point out: 


formal, controlled studies are doomed to depreciate 
progressively with the passage of time from the end
of the controlled treatment period with much of 
their discriminating power being eroded by con- 
tamination . . . it is inevitable that the longer 
the follow-up, the more all treatments approxi- 
mate the same end result [p. 762]. 
On the basis of their own research, Stone
et al. (1961) state further that, “evaluation 
of different forms of psychotherapy should 
be primarily in terms of their immediate 
results [p. 420].” In essential agreement, 
the stability of treatment effects over the 
2-year follow-up period in the present study, 
combined with the failure to find a single case 
which could be considered evidence of relapse 
or symptom substitution for any treated S, 
suggests that the short-term follow-up pro- 
vided adequate evaluation of comparative 
treatment effects. Thus, for the evaluation 
of psychological treatment with noninstitu- 
tionalized adults, more scientifically useful 
information is likely to be obtained if future 
efforts are directed towards short-term follow- 
ups, in which total sample assessment of 
treated Ss may be obtained, rather than 
longer follow-ups, which suffer from dif- 
ferential attrition and the effects of uncon- 
trolled environmental influences. The number 
and timing of follow-ups should be determined 
by the nature of the population and problem, 
rather than preconceived theoretical notions 
(Paul, in press). 

However, the methodological difficulties of 
follow-up studies should not overshadow the 
major findings of the present investigation,
namely, that modified systematic desensitization
produced significant and lasting reductions
in maladaptive anxiety, not only on an
absolute level, but also in comparison with 
other treatment and control groups. None of 
the effects predicted on the basis of the 
traditional disease-analogy model were forth- 
coming, while considerable evidence was 
found for a learning model. Results as con- 
sistent as these are rare in the psychotherapy 
literature and require not only replication, 
but also an extension of evaluations across 
differing populations of clients, therapists, 
and problems, as well as parametric investiga- 
tions of the mechanics involved. 
PREDICTION OF COMMUNITY STAY AND EMPLOYMENT
FOR RELEASED PSYCHIATRIC PATIENTS 


THEODORE W. LOREI 1
Veterans Administration Hospital, Washington, D. C. 


The purpose of this study was to identify characteristics of released psychi- 
atric patients and their relatives that are associated with success in remaining 
in the community and obtaining employment. 215 male patients were studied 
at the time of release and followed for 1 yr. 11 and 13 variables were 
significantly (p < .05) correlated with community stay and employment, 
respectively. The predictor intercorrelation matrix was factored, and patients 
were scored on the resulting 6 Varimax factors. A series of regression analyses 
was run using these factor scores as independent variables. Distress/Alienation
and Drinking/Antisocial Behavior were the most important predictors of length
of community stay; Chronicity/Severity of Disorder and Simple-Mindedness
were the most important predictors of employment.


In an earlier paper, the writer reported an
attempt to predict whether psychiatric patients
would remain in the community for at
least 9 months after release (Lorei, 1964).
The California Psychological Inventory (CPI;
Gough, 1957), the Waco Social Adequacy
Scale (Pinchak & Rollins, 1960), the Opinions
about Mental Illness Questionnaire
(OMI; Cohen & Struening, 1962), and background
data from records were selected as
hypothesized predictors. In that study the
Waco, two of the background variables, and
two of the OMI scales (from relatives of patients)
were significantly related to staying in
the community. The present study is concerned
with identifying additional patient and
relative characteristics that are related to release
outcomes.

This study differs from the earlier one in
that the follow-up period was longer, both
continuous and dichotomous length of community-stay
measures were used, an employment
criterion was added, and a larger sample
was used. The battery of hypothesized predictors
was developed so that areas showing promise
in the first study were assessed more comprehensively.
Some further discussion of the rationale
for predictor selection will be given in
the Method section.

If one distinguishes between the explaining
and forecasting purposes of “prediction research,”
then it should be said that this study
is more oriented to the former than to the
latter. While it is recognized that correlational
studies may not provide definitive answers to
why patients return to the hospital or fail to
adjust occupationally, they seem to be reasonable
and necessary beginnings. Prediction
formulas are not presented here, not because
the use of probability information in making
decisions about release or outpatient treatment
is considered unimportant, but because
it seems that more time should first be given
to forming more valid predictor batteries.

1 The author expresses his indebtedness to many
Veterans Administration personnel, particularly the
members of the Social Work Service at Veterans
Administration Hospital, Lyons, N. J.


METHOD


The sample contained 215 male veterans consecu- 
tively released from the Veterans Administration 
Hospital, Lyons, New Jersey, between August 1962 
and May 1963, after having been treated for psychiatric
disorders. Patients who were over 60 years,
carried a diagnosis of central nervous system pathology,
or were unable to complete the questionnaires
were excluded. The mean age of the study
group was 39.7 years (SD = 7.7) and the mean education
was 10.1 years (SD = 2.5). Of these, 39% were never
married and 16% were Negro. The median length of
all time spent in psychiatric hospitals prior to release
was 26.4 months (range 1-229) and the median
length of the hospitalization just prior to release
was 9.2 months (range 1-229). Of the patients, 77%
were schizophrenic, 17% were psychoneurotic, and
6% carried other functional psychiatric diagnoses.


Predictor Measures—Patient 


Records. Veterans Administration records were 
searched to obtain data on age, race, marital status, 
education, diagnosis, percentage of adult life spent 
in a psychiatric hospital, whether or not the patient 
had been hospitalized in the 2 years before the
current or “index” admission, and whether or not 
the patient had a serious drinking problem or had 
committed a serious crime. 

Questionnaires. Two questionnaires were adminis- 
tered to the patients just before they left the hos- 
pital. They were: (a) the Palo Alto Social Back- 
ground Inventory (PASBI), a 91-item true-false 
questionnaire dealing with life history information, 
current social functioning, and psychiatric sympto- 
matology developed by Ullmann and Giovannoni 
(1964) and (b) the Work and Family Attitude Scale 
(WFAS), a 75-item Likert-type questionnaire deal- 
ing with patient opinions about work and family 
life. Of these items, 55 were taken from the Work 
Value Scale (Struening & Efron, 1965). At the time 
of questionnaire administration the patient was asked 
where he intended to live after release and whether 
he had been employed full time for at least 6 months 
in the 5 years preceding this release. 

Although general personality inventories have not 
been promising as prognostic variables, it was hoped 
that questionnaires with content written especially 
to assess functioning hypothesized to be relevant to 
posthospital performance would be more successful. 

Ratings. The Patient Rating Scale (PRS) consists 
of 11 items taken from the Veterans Administration 
Program Evaluation Staff’s SHAVER 2 and 16 items 
written to measure normal traits, for example, domi- 
nance and dependability. The PRS was completed by 
clinical social workers who had observed the pa- 
tients over a period of time. 


Predictor Measures—Relative 


Lorei (1964) and Freeman and Simmons (1963) 
found that relatives’ attitudes toward mental ill- 
ness were modestly related to length of community- 
stay criteria. Since the OMI, the instrument the 
writer used in his first study, was written for hos- 
pital personnel rather than for relatives, a 60-item 
Likert-type opinion questionnaire was written that 
dealt more specifically with issues likely to be of 
concern to relatives of returning patients. This ques- 
tionnaire, labeled the Survey of Opinion Question- 
naire (SOQ), also contained 25 of the original OMI 
items. The SOQ was mailed to relatives within 5 
days after each patient’s release. 

In order to reduce the number of scores yielded 
by each instrument to manageable size, the item 
intercorrelation matrices of the PASBI, the WFAS, 
the PRS, and the SOQ were factor analyzed (princi- 
pal component—Varimax), and the subjects scored 
on the resulting factors. Space precludes describing
the content of the factor scales here; however, the
factor labels will give a fairly good idea of the scale
content, and illustrative items of each of the
criterion-related scales are given.3


2 The SHAVER (symptom, history, and vocational
expectation report) is an interview rating
form for use by professional personnel. This form is
a refinement and extension of the Symptom Rating
Scale reported on by Cohen, Gurel, and Stumpf
(1966).


TABLE 1
OUTCOME INTERCORRELATIONS

Outcome criterion                          ICDt     ICD365    Work(a)
In-community days, continuous (ICDc)       86**     88**      -33**
In-community days, total (ICDt)                     79**      -30**
In-community for 1 year (ICD365)(b)                           -31**

Note.—Decimal points omitted; N = 215.
a 1 = yes, 2 = no.
b 1 = no, 2 = yes.
** p < .01, two-tailed test.


Criterion Measures 


The two primary outcome criteria were the num- 
ber of days during the first year the patient spent 
continuously in the community without interruption 
by hospitalization (ICDc) and whether or not the
patient was employed full time for at least 6 months.
The total number of days the patient spent in the
community, even if he was rehospitalized and then
released again (ICDt), and whether the patient remained
out for 365 days (ICD365) were also determined.
Criterion data were obtained by questionnaire
and record search.


RESULTS 


Of the 215 patients, 83 (39%) returned to 
the hospital during the first year; of these, 56 
returned to the community again within the 
same year. The mean number of ICDc was
279.1 (SD = 122.8) and the mean number of
ICDt was 310.4 (SD = 86.9). These distributions
were skewed because of the large
number remaining out for at least 1 year. Of 
the 215, 52 (24%) worked full time for at 
least 6 months. The percentage returning dur- 
ing the first year is very similar to the 38.2% 
found by Freeman and Simmons (1963), and 
42.1% found by Gurel (1966). The intercorrelations
of the four outcome criteria are
presented in Table 1.
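
The return and employment percentages quoted above follow directly from the reported counts. A minimal Python sketch of the arithmetic is given here; the variable and function names are illustrative only.

```python
N_PATIENTS = 215  # released patients followed for 1 year
returned = 83     # returned to the hospital during the first year
employed = 52     # worked full time for at least 6 months


def percent(count, total):
    """Count expressed as a percentage of the total."""
    return 100.0 * count / total


print(round(percent(returned, N_PATIENTS)))  # 39
print(round(percent(employed, N_PATIENTS)))  # 24
```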

In reading the remaining analyses it should 
be noted that sample sizes differ for different 
sets of predictors. This situation results be-


3 Copies of the Varimax factor matrices, as well
as tables containing scoring formulas, intercorrelations,
and internal consistency reliabilities of factor
scores for the PASBI, WFAS, SOQ, and PRS may
be obtained from the author.
TABLE 2
VARIMAX FACTOR MATRIX AND PREDICTOR-CRITERION CORRELATIONS

                                                                          Factor                        Criteria
Variable                                              N   rtt     I    II   III    IV     V    VI    h2    ICD    Work(a)
Adult life hospitalized (%)                         215    --    78   -07   -09   -15    03   -04    65     03    [illegible]
Being married (PASBI)                               215    83   -78    01   -16   -02    03    03    63     00    -27**
Marital status (1=never, 2=other)                   215    --   -75    01   -04   -01    05    12    58     04    -23**
Recent job (1=yes, 2=no)                            215    --    64    12    02    11    05    05    44     00     30**
Chronicity (PASBI)                                  215    63    63    11   -10   -01    10    20    47    -06    [illegible]
Manifest psychoticism (PRS)                         100    83   [illegible]        19   -08    11    57   [illegible]   23*
Competence (PRS)                                    100    80   -48   -34   -04   -24    18   -44    63     13    -24*
Diagnosis (1=psychotic, 2=other)                    215    --   -46    06    11    26    29   -08    39    -06    -10
Social adequacy (PASBI)                             215    63   -43   -17   -10   -20    02    17    29    -11    -04
Hospitalized 2 yr. before admission (1=yes, 2=no)   215    --   -35    02   -06   [illegible]  -30   -09    29     17*   [illegible]
Stressful existence (WFAS)                          215    88    06    86    07    06   -06    00    75    -10     10
Negative evaluation of family (WFAS)                215    84    20    78    20   -11    05    03    70    -14*   [illegible]
Alienation (WFAS)                                   215    76   -03    76    20   -21    10   -04    67   [illegible]
Expectation of special consideration (WFAS)         215    73    27    71    00   -06    17    18    64    -09     16*
Professed disability (PASBI)                        215    82   -09    61   -13    28    24   -12   [illegible]          16*
Well-being (PASBI)                                  215    74    25   -55    13   -15   -09    38    56     16*    09
Emotional stability (PRS)                           100    74   -19   -45   [illegible]
Broken home (PASBI)                                 215    65    00    43   -08   [illegible]                      00
Patient depreciation (SOQ)                          153    61   -03    03    71    03   -08    05    52    -19*    07
Alienation (SOQ)                                    153    89    19    17    71   -04   -12    00    59    -05     11
Independence requirement (SOQ)                      153    56   -08   -15    67   -03    21   -14    54    -15    -07
Inclination to hospitalize (SOQ)                    153    68    07   -01    54   -02   -05   -03    30    -05     08
Interpersonal etiology (SOQ)                        153    62   -09    05    44   -03   [illegible]                02
Perceived degree of disorder (SOQ)                  153    64    03    15    39    14   [illegible]                09
Benevolence (SOQ)                                   153    63   -05   -25    33    02    27    20    29    -13     04
Lack of confidence (PRS)                            100    86    25    07    02    66   -16    27   [illegible]   -03    11
Reports self mentally ill (PRS)                     100    --    04   -03   -08    52   -07   -31    38    -01    -09
Tension (PRS)                                       100    --   -05   [illegible]
Race (1=white, 2=other)                             215    --    14    29    05   -43    02    11    31    -04     16
Family educational level (PASBI)                    215    59   -10   -07    01   -41    21   -03    23    -06     03
Year of birth                                       215    --    29    09   -17   -39   -06   -26    34    -06     01
Admission of drinking (PASBI)                       215    68   -15    10   [illegible]        64    01    45    -19**   06
Antisocial behavior (PASBI)                         215    54    04    13   [illegible]
Drinking problem (1=none, 2=some)                   215    --   -23    10    05   -12    57   -02    41    -24**   07
Living arrangements (1=alone, 2=with others)        215    --   -09    08   -18   -20   -54   -01    38     04    -01
Law trouble (1=none, 2=some)                        215    --    14   -04   -11   -14    50   -06    31     04     04
Uncritical optimism (WFAS)                          215    80   -05   -03   [illegible]
Education                                           [illegible]
Well-being (WFAS)                                   215    52    03   -27    09   -32   -04   [illegible]
Positive work evaluation (WFAS)                     215    67   -33    08   -09   -12    14   [illegible]
Familial loyalty (SOQ)                              [illegible]

Note.—Decimal points omitted; rtt column indicates Spearman-Brown approximation of the Kuder-Richardson 20 formula
for internal consistency reliability.
a 1 = yes, 2 = no.
* p < .05, two-tailed test.
** p < .01, two-tailed test.
cause relative attitude data were available
only for patients who returned to live with 
relatives and whose relatives returned the 
questionnaires. The questionnaire return rate 
was 81%. Data on PRS were available only 
for those patients known to the social work 
raters. 

The zero-order correlations of each of the 
predictors with continuous in-community days 
and 6 months full-time employment (present 
or absent) are presented in the last two col- 
umns of Table 2. 

Inspection of the zero-order correlations 
shows that success in remaining in the com- 
munity is associated with not being hospital- 
ized in the 2 years prior to index admission, 
not having a drinking problem (according to 
records and questionnaire), not “professing 
disability,” reporting a feeling of “well-being,” 
disagreeing with items expressing “negative 
evaluation of family” and “alienation,” and 
being rated low on “manifest psychoticism” 
and high on “emotional stability.” The rela- 
tives of successful patients tended to see them 
as less disabled and to disagree with attitude 
statements indicating “patient depreciation.” 

Success in obtaining and holding a full- 
time job for at least 6 months is associated 
with being white, married, having been in 
the hospital a small percentage of adult life, 
not having been hospitalized in the 2 years 
prior to index admission, having a recent job, 
not “professing disability,” not being “un- 
critically optimistic,” not “expecting special 
consideration,” not having a “negative evalu- 
ation of family life,” and being rated low on 
“manifest psychoticism” and high on “compe- 
tence.” The relatives of successful patients 
tended to disagree with statements expressing 
“family loyalty.” Illustrative items from each 
of the factor scales that are significantly cor- 
related with ICD or work are presented in 
Table 3. 

Two questions arise in interpreting the 
results of Table 2: 

1. Which of the relatively large number of 
correlations reported are likely to be signifi- 
cant by chance alone? 

2. How many independent patient and rel- 
ative variables are being assessed by the mea- 
sures used? 

To obtain data on the first question, the 
total sample and the two subsamples (of the
total sample) were split, and predictor-criterion
correlations were computed in each
half. The following variables are correlated
(p < .10) with ICD in both halves of the
original sample (or subsample): professed disability
(PASBI), emotional stability (PRS),
and admission of drinking (PASBI). Per- 
centage of adult life hospitalized, being mar- 
ried (PASBI), marital status, recent job, and 
uncritical optimism (WFAS) are significantly 
correlated with work. The more lenient sig- 
nificance level was used to adjust partially for 
the loss of test power that results from de- 
creasing the sample size. This cross-validation 
probably gives an overly conservative esti- 
mate of the true number of significant corre- 
lations, because the significance tests are nec- 
essarily less powerful in the subsamples than 
in the total sample. Nevertheless, this pro- 
cedure does give a minimum estimate of the 
number of “truly significant” correlations. 
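
The split-sample check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's procedure: the random split, the Fisher z approximation used for the significance test, the function names, and the toy data are all supplied here, since the report specifies only that the sample was split and that a .10 level was applied in each half.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p(r, n):
    """Two-tailed p for r via the Fisher z approximation (assumed test)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def replicated(predictors, criterion, alpha=0.10, seed=0):
    """Names of predictors with p < alpha in both random halves."""
    order = list(range(len(criterion)))
    random.Random(seed).shuffle(order)  # random split is an assumption
    mid = len(order) // 2
    halves = (order[:mid], order[mid:])
    kept = []
    for name, values in predictors.items():
        p_values = [
            approx_p(
                pearson_r([values[i] for i in half],
                          [criterion[i] for i in half]),
                len(half),
            )
            for half in halves
        ]
        if all(p < alpha for p in p_values):
            kept.append(name)
    return kept


# Hypothetical data: one predictor that tracks the criterion closely.
criterion = [v + (v % 3) for v in range(40)]
predictors = {"strong": list(range(40))}
print(replicated(predictors, criterion))  # ['strong']
```

Requiring significance in both halves is what makes the resulting count a minimum estimate: a predictor with a modest true correlation can easily miss the criterion in one half simply because each half has about half the cases.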

To study systematically the predictor interrelationships
and to determine how many
relatively independent variables were being 
assessed by the predictor battery, the pre- 
dictor matrix was factor analyzed. Ten prin- 
cipal component factors were extracted with 
unities in the diagonal, and sets of 10 through 
4 of these factors were rotated by the normal 
Varimax procedure. The six-factor set was 
judged the most satisfactory on the basis of 
interpretability and factor-scale internal con- 
sistency reliability. It is presented in Table 
2. Since many of the predictors themselves 
were products of previous factor analyses, the 
present factors may be considered as “second 
order.” 

The six factors accounted for 44% of the 
total test variance. The interpretation of each
factor is summarized by the following labels:
I, Chronicity/Severity of Disorder (24% of
common factor variance); II, Distress/Alienation
(23%); III, Patient Depreciation
(13%); IV, Drinking/Antisocial Behavior
(12%); V, Feeling of Inadequacy (13%);
and VI, Simple-Mindedness (14%). Factor
scores were computed and used in a series of
regression analyses to predict ICD and work.
Using the factor scores rather than the orig- 
inal variable scores has the advantage of 
TABLE 3
ILLUSTRATIVE ITEMS FROM SCALES SIGNIFICANTLY CORRELATED WITH ICD OR WORK

Questionnaires                                                                                    High scoring direction

Professed disability (PASBI)
  My nervousness makes it harder for me to do things well.                                        true
  My poor health has kept me from holding down a steady job.                                      true
Well-being (PASBI)
  I am happy most of the time.                                                                    true
  I have often had trouble getting along with my boss or supervisor.                              false
Negative evaluation of family (WFAS)
  Although they won’t come right out and admit it, even members of your own family will never     agree a lot
    completely trust you after you have been in a mental hospital.
  Members of your own family will show you much more consideration and sympathy if you are        agree a lot
    physically ill rather than mentally ill.
Alienation (WFAS)
  It is hard to figure out who you can really trust these days.                                   agree a lot
  These days a person doesn’t really know who he can count on.                                    agree a lot
Uncritical optimism (WFAS)
  Parents usually treat their children fairly and sensibly.                                       agree a lot
  The best way to get along in the world is to make careful plans for the future.                 agree a lot
Expecting special consideration (WFAS)
  Doctors and other people who work in mental hospitals just don’t understand how hard it is for  agree a lot
    a discharged mental patient to look for and find a job.
  Many discharged mental patients should never try to get a job because it upsets them and will   agree a lot
    make them sick again.
Patient depreciation (SOQ)
  Many mental patients use their sickness as an excuse for getting out of things they don’t want  agree a lot
    to do.
  Many patients remain in the hospital because they are too lazy to work.                         agree a lot
Perceived degree of disorder (SOQ)
  Degree to which patient’s speech or behavior seems odd or abnormal.                             extremely
  Degree to which patient’s coming home will cause inconvenience to the family.                   extremely
Family loyalty (SOQ)
  A wife should stick by her husband even when mental illness seriously damages his personality.  agree a lot
  A person should always be willing to help a member of his own family even if it interferes with agree a lot
    his personal plans.

Rating scales                                                                                     High scoring direction

Manifest psychoticism (PRS)
  Shows paranoid suspicion.                                                                       markedly
  Shows impaired reality testing.                                                                 markedly
Emotional stability (PRS)
  Casual, undependable versus conscientious, persistent.                                          casual
  Emotional, unstable versus mature, calm.                                                        emotional
Competence (PRS)
  Dull, low capacity versus bright, intelligent.                                                  bright
  Shows peculiar behavior.                                                                        not at all
TABLE 4

REGRESSION ANALYSES FOR CONTINUOUS IN-COMMUNITY DAYS AND WORK

                                            In-community days                        Work (1=yes, 2=no)
                                             Criterion     Variance                 Criterion     Variance
Factor                              Beta    correlation  explained (%)    Beta     correlation  explained (%)
I   Chronicity/Severity of Disorder  .00       -.01          00.0      [illegible]    .33**         11.0
II  Distress/Alienation             -.14*      -.16*         02.0      [illegible]    .14*          02.0
V   Drinking/Antisocial Behavior    -.15*      -.17*         03.0         .09         .07           01.0
VI  Simple-Mindedness                .09        .11          01.0      [illegible]    .19**         04.0

Note.—R = .24* for in-community days, R = .42** for work; N = 215.
* p < .05.
** p < .01.


greatly increasing the degrees of freedom for 
the regression problem. Additionally, since it 
may be assumed that the factor-score matrix 
represents that part of the predictor scores 
which is free from error or random variation, 
the resulting multiple correlations should be 
quite stable estimates of population values. 

Since the data necessary to score all pa- 
tients on all six factors were not available, 
Factors I, II, V, and VI were scored for the 
complete sample, Factor III was added for 
the relative attitude subsample, and Factor 
IV was added for the PRS subsample. The 
multiple-regression analyses are presented in 
Tables 4, 5, and 6. 

The patient who is high on Distress/Alienation 
(Factor II), on Drinking/Antisocial 
Behavior (Factor V), and whose relative is 
high on Patient Depreciation (Factor III), 
tends to have a short stay in the community. 
The patient who is high on Chronicity/Severity 
of Disorder (Factor I) and Simple-Mindedness 
(Factor VI) tends not to work.* 

The same regression analyses were repeated 
for only the psychotics in the total sample and 
the two subsamples. For the 169 psychotics of 
the total group, the multiple R for ICD in- 
creased from .24 to .28, and for work re- 
mained at .42. For the 122 psychotics of the 
relative attitude sample, the multiple R for 
ICD increased from .32 to .35, and from .48 
to .51 for work. Finally, for the 74 patients 
of the PRS sample, the multiple R for ICD 
rose from .37 to .44, and from .53 to .56 for 
work. The pattern of significant beta weights 
remained the same for the psychotics as for 

* In those instances where a predictor-criterion 
correlation is significant in the largest sample or sub- 
sample on which the predictor is available, it is 
considered significant even though it may not have a 
significant correlation for this next smaller group. 
This seems reasonable because the most stable popu- 
lation estimate should come from the larger group. 


TABLE 5
REGRESSION ANALYSES FOR CONTINUOUS IN-COMMUNITY DAYS AND WORK

                                          In-community days                        Work (1 = yes, 2 = no)
                                              Criterion     Variance                   Criterion                  Variance
Factor                                Beta    correlation   explained (%)   Beta       correlation                explained (%)
I Chronicity/Severity of Disorder     -.02      -.02           00.0          .40**     [illegible]                   15.0
II Distress/Alienation                -.08      -.13           01.0          .08        .14                          01.0
III Patient Depreciation (relative)   -.18*     -.19*          04.0          .05        .08                          00.0
V Drinking/Antisocial Behavior        -.21**    -.23**         05.0          .14        .06                          01.0
VI Simple-Mindedness                   .05       .09           01.0          .25        .22 [marker illegible]       05.0
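
The "variance explained" columns in these tables appear to be, approximately, the product of each beta weight and its criterion correlation. For standardized regression weights, those products sum to the squared multiple correlation. The following is a minimal Python sketch of that arithmetic, using the in-community-days columns of Table 5 as printed. The function and variable names are illustrative assumptions, not part of the original analysis, and the rounded table entries can only reproduce the reported multiple R (.32 for the relative attitude sample) approximately.

```python
import math

# Table 5, in-community days: (beta, criterion correlation) for each factor,
# copied from the printed (rounded) table entries.
TABLE5_ICD = {
    "I Chronicity/Severity of Disorder": (-0.02, -0.02),
    "II Distress/Alienation": (-0.08, -0.13),
    "III Patient Depreciation (relative)": (-0.18, -0.19),
    "V Drinking/Antisocial Behavior": (-0.21, -0.23),
    "VI Simple-Mindedness": (0.05, 0.09),
}


def multiple_r(rows):
    """Return (R squared, R), using R^2 = sum(beta_i * r_i).

    The identity holds for standardized beta weights; treating the tabled
    betas as standardized is an assumption of this sketch.
    """
    r_squared = sum(beta * corr for beta, corr in rows.values())
    return r_squared, math.sqrt(r_squared)


if __name__ == "__main__":
    for name, (beta, corr) in TABLE5_ICD.items():
        # Each product is that factor's share of the criterion variance.
        print(f"{name}: {100 * beta * corr:.1f}% of criterion variance")
    r2, r = multiple_r(TABLE5_ICD)
    print(f"R^2 = {r2:.3f}, R = {r:.2f}")
```

Because the inputs are rounded to two decimals, individual products may differ by a percentage point from the tabled "variance explained" values.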


TABLE 6
REGRESSION ANALYSES FOR CONTINUOUS IN-COMMUNITY DAYS AND WORK

                                          In-community days                        Work (1 = yes, 2 = no)
                                              Criterion     Variance                   Criterion                  Variance
Factor                                Beta    correlation   explained (%)   Beta          correlation             explained (%)
I Chronicity/Severity of Disorder     -.13      -.15           02.0          .41**         .41**                     17.0
II Distress/Alienation                -.24*     -.31**         07.0          .04           .16                       01.0
III Patient Depreciation (relative)   -.05      -.08           00.0         -.02           .06                       00.0
IV Feeling of Inadequacy               .05       .05           00.0         -.01          -.02                       00.0
V Drinking/Antisocial Behavior     [illegible] [illegible]     04.0          .15           .04                       01.0
VI Simple-Mindedness                   .03       .04           00.0         [illegible]    .30 [marker illegible]    09.0

Note. R = .37* for in-community days, R = .53** for work; N = 100.
* p < .05.
** p < .01.


the total sample, with the exception that 
Drinking/Antisocial Behavior is not a signifi- 
cant predictor of ICD, but in the two sub- 
samples it is a significant predictor of work. 

While, as noted above, there are important 
advantages to using the six-factor scales rather 
than the original variables as predictors, the 
regression results should be considered in con- 
nection with the data presented in Table 2. 
First of all, if one were looking for an eco- 
nomical prediction battery, he could select 
only one or two representative variables from 
each of the factors rather than scoring on the 
total set of variables. Second, while the six 
factors give a general picture of what is im- 
portant for prediction, a more specific view 
can be obtained by inspecting the zero-order 
correlations of the items defining each factor. 
In some instances individual variables have 
higher criterion correlations than the factor 
scores. This probably results because the indi- 
vidual variable’s specific and error variance 
can be shared with the criterion but cannot 
be shared with the other variables defining the 
factor. It also seems that factor-scale validities 
are attenuated by the noncontributing 
variables defining the factors. 


DISCUSSION 


The findings of some of the more recent 
studies concerned with predicting release out- 
comes are discussed in the earlier report 
(Lorei, 1964). Since that review, Johnston 
and McNeal (1965) found that patients 
whose MMPI exit profiles were judged im- 
proved over their admission profiles stay out 


of the hospital significantly longer than pa- 
tients who did not show this improvement. 
Consequently, they advocate change measures 
for the prediction of posthospital adjustment 
criteria. 

In general, it is difficult to compare the find- 
ings of other studies because of differences in 
sample definition and the great diversity of 
predictor and criterion measures. The factor- 
ing of the predictor matrix in this study seems 
to clarify considerably the meaning of the 
individual variables by establishing them as 
indicators of more basic constructs. Zigler and 
Phillips (1961) have complained, for exam- 
ple, that the continued piecemeal investiga- 
tion of case history items offers little of 
heuristic value toward an understanding of 
prognosis. Factor analysis makes the meaning 
of these kinds of variables clearer by com- 
bining them with other measures of clearer 
personological implication. 

The most general conclusion of the present 
study is that both length of community stay 
and employment are predictable, although the 
errors of estimate are large. In words that do 
not have the forecasting implication that “pre- 
diction” has, one can say that the measures 
used assessed attributes that enter into the 
determination of release outcomes or, at least, 
are correlates of such determinants. 

It may be useful to offer a few explanatory 
hypotheses about each of the factor-scale— 
criterion correlations. At least two overlapping 
possibilities come to mind about the associ- 
ation between high Distress/Alienation scores 
and short community stay. High scores may 
indicate a greater vulnerability to posthos- 
pital stress or may indicate a patient who 
is more likely to seek rehospitalization to re- 
lieve discomfort that may result from com- 
munity living. This second possibility sug- 
gests the desirability of determining in future 
studies whether the patient himself sought 
readmission or whether he was hospitalized 
on someone else’s initiative. The association 
between serious drinking problems and early 
return requires no comment. 

The correlation between Patient Depreci- 
ation and early return is consistent with 
earlier findings that “negative” relative atti- 
tudes are prognostically unfavorable (Free- 
man & Simmons, 1963; Lorei, 1964). The 
significance (p < .05) of this variable’s beta 
weight in the regression analysis (N = 153) 
suggests that it is of importance in its own 
right and is not merely a correlate of less 
favorable patient condition. Intended living 
arrangement, both when treated dichoto- 
mously (alone versus with others) and when 
categorized according to the specific relative 
involved, was not significantly associated with 
any of the outcome criteria. 

The patient attributes relevant to posthos- 
pital work are nonoverlapping with those re- 
lated to his staying in the community. That 
Chronicity/Severity of Disorder is related is 
as expected. The writer has no hypothesis to 
offer about why high scores on Factor VI are 
associated with not working. While one might 
think the “all is well” attitude would be char- 
acteristic of the chronic patient, the correla- 
tion between Factor Scale I and Factor VI is 
only .05. 

The following implications of this study for 
future research occur to the investigator. First, 
while the criteria used here are certainly so- 
cially significant, it seems that others should 
be added. Measurement of additional criteria 
would add to the explanatory and decision- 
making usefulness of prediction studies. Addi- 
tional criteria might include the occurrence of 
assaultive or suicidal behavior, the occurrence 
of hallucinatory or delusional experiences, se- 
vere anxiety, and depression, etc. 

Second, it seems clear that patient questionnaire 
data of the type used here are useful. 
Nevertheless, it also seems that the assessment 
provided uniquely by the questionnaires 
could be obtained with considerably 
fewer questions than were used. The same 
may be said about the relative attitude ques- 
tionnaires. A general “pro-mental patient” at- 
titude seems relevant to length of community 
stay, but it does not take 60 items to measure 
it. It would seem useful to determine some of 
the behavioral correlates of the relatives' expressed 
attitudes, as Ellsworth (1965) did 
with nursing personnel. The patients them- 
selves could provide the behavior ratings as 
was done in Ellsworth’s (1965) study. It 
would also seem desirable to extend attitude 
measures to all adult members of the family 
group. 

Third, it seems reasonable, a priori, that 
clinicians should want to quantify their as- 
sessment of patient functioning by means of 
rating scales, and the results of this study 
show that such assessment does relate to out- 
come criteria. It is the investigator’s belief 
that ratings based on a continuing period of 
interaction with the patients (as were the PRS 
ratings) are likely to be more highly related 
to outcome criteria than ratings made on the 
basis of observations made during a single 
interview. It also seems that the rating scales 
should not be limited to “symptoms” nar- 
rowly defined, but should cover the broader 
personality sphere in so far as it is accessible 
to raters. 

3 studies conducted at 3 different universities tested the hypothesis of a relationship 
between a sensation-seeking tendency and volunteering for experiments 
in hypnosis and sensory deprivation. Male and female undergraduates who 
volunteered for hypnosis experiments were found to be significantly higher 
on the Sensation-Seeking Scale (SSS) than nonvolunteers. Females volunteer- 
ing for sensory deprivation experiments were higher on the SSS than non- 
volunteers, but male volunteers were significantly higher in only 1 of the 2 
samples tested. Marked university population differences were found on the 
SSS, and these differences bore some relation to the magnitude of differences 
between volunteers and nonvolunteers within each sex group. No relations 
between birth order, hypomania, anxiety, and volunteering were found in the 
studies where they were examined. 

University of North Carolina 


Most psychological experiments involving 
human subjects rely on the selection of volun- 
teers. The representativeness of these volun- 
teers is a serious question limiting generaliza- 
tion of results, even to the limited population 
of college undergraduates. Generally, the non- 
volunteer population is not studied, and the 
implicit assumption is that the volunteers are 
a representative sample of the population 
from which they were recruited. Reviews of 
the studies which have compared volunteer 
and nonvolunteer samples give serious reasons 
for doubting the aforementioned assumption 
(Bell, 1962; Rosenthal, 1965). Rosenthal 
summarized the literature by stating that 
volunteers, relative to nonvolunteers, tend to 
manifest greater intellectual ability, interest, 
and motivation and tend to be more uncon- 
ventional, younger, less authoritarian, and 
more sociable. Bell feels that the literature 
indicates that the volunteers tend to be less 
well adjusted and less socially extroverted 
than nonvolunteers. Schubert (1964) has of- 
fered an interesting hypothesis to explain his 
data, which show that volunteers for psy- 
chological experiments tend to report more 


1 This investigation was supported in part by 
United States Public Health Service Research Grant 
MH-07926 from the National Institute of Mental 
Health. 


coffee drinking, caffeine pill taking, and ciga- 
rette smoking and are higher on the Psy- 
chopathic Deviate and Hypomania scales of 
the MMPI. Schubert feels that volunteers 
have a strong trait which can be described as 
“arousal seeking.” The concept of arousal 
seeking is similar to Zuckerman, Kolin, Price, 
and Zoob’s (1964) concept of “sensation 
seeking.” These authors developed a scale 
aimed at measuring individual differences in 
“optimal level of stimulation” (Leuba, 1955) 
or “sensoristasis” (Schultz, 1965). The Sensa- 
tion-Seeking Scale (SSS) has been applied in 
sensory deprivation (SD) and confinement 
experiments (Zuckerman, Persky, Hopkins, 
Murtaugh, Basu, & Schilling, 1966; Zubek²) 
where high sensation-seeking tendencies have 
been shown to predict quitting behavior and 
restless body movement in confinement, with 
or without sensory deprivation. Despite the 
restiveness of sensation seekers in a situa- 
tion of confinement, it is possible that they 
are attracted to such experiments because of 
the prevalent rumors that SD conditions produce 
strange and bizarre sensations, for example, 
hallucinations. 

There have been few actual studies of 
volunteers for SD or hypnosis experiments 


2 J. P. Zubek, personal communication, February 
1966. 

which are good examples of “off-beat” types 
of research likely to draw from groups of 
volunteers with high sensation-seeking tend- 
encies. Martin and Marcuse (1958) found 
that combined male and female volunteers for 
hypnosis were high in intelligence and low 
on ethnocentrism. Since high sensation seek- 
ers, based on the SSS, seem to place excite- 
ment above conventional behavior, this result 
would also lead us to predict that volunteers 
for hypnosis would be high on the SSS. 
Lubin, Brady, and Levitt (1962) found 
that female volunteers for hypnosis experi- 
ments who actually appeared for their ap- 
pointments “have higher needs for exhibition 
and aggression, are less objective, and lower 
in friendliness, personal relations and need for 
order . . . [and] in the social value area 
[p. 342].” The authors also found the volun- 
teers higher than nonvolunteers and “non- 
showers” on a special Rorschach Dependency 
score (Levitt, Lubin, & Zuckerman, 1962). It 
might be noted that this dependency score 
was found to be unrelated to other indexes 
of dependency (Zuckerman, Levitt, & Lubin, 
1961) and either represents “unconscious de- 
pendency needs” or something else. Overtly, 
the volunteers seem to incorporate many hy- 
pothetical traits of the sensation seeker, in 
that they manifestly prize personal aggrandize- 
ment above order and friendly personal rela- 
tions. One of the items on the SSS asks 
whether the worst social sin is to be rude or 
to be a bore.³ Being a bore is the most unforgivable 
social sin to a sensation seeker. 


3 Other sample items from the Sensation-Seeking 
Scale: 
(4) A. I often wish I could be a mountain climber. 
B. I can’t understand people who risk their 
necks climbing mountains. 
(5) A. I dislike all body odors. 
B. I like some of the earthy body smells. 
(6) A. I get bored seeing the same old faces. 
B. I like the comfortable familiarity of everyday 
friends. 
(9) A. I would not like to try any drug which 
might produce strange and dangerous effects 
on me. 
B. I would like to try some of the new drugs 
that produce hallucinations. 
(11) A. I sometimes like to do things that are a 
little frightening. 
B. A sensible person avoids activities that are 
dangerous. 

Myers, Murphy, Smith, and Goffard⁴ have 
compared volunteers and nonvolunteers for 
perceptual isolation on a number of tests and 
biographical data indexes. The population 
consisted of Army personnel. About three- 
quarters of a group of 551 men volunteered 
to undergo 96 hours of perceptual isolation. 
They were offered no monetary rewards, and 
the stated motives of the volunteers were: 
contributing to science, testing their reactions 
to stress, thinking about personal problems, 
planning for the future, or catching up on 
their sleep. The volunteers were generally 
volunteers for the Army rather than draftees, 
younger, higher on a combat aptitude test, and 
lower on MMPI Depression and Psychopathic 
Deviate (Pd) than the nonvolunteers. The 
fact that nonvolunteers, rather than volun- 
teers, were higher on the Pd scale seems to 
argue against the hypothesis that volunteers 
for SD will tend to be high sensation seekers. 
The Pd scale contains a number of items re- 
flecting sensation-seeking tendencies. Kipnis 
(1966) found a .29 correlation between this 
scale and the SSS in a group of 150 naval 
recruits. Zuckerman⁵ found a .42 correlation 
between the Gough socialization scale, scored 
in the asocial direction, and the SSS in a 
group of 441 female undergraduates. These 
findings would lead to the prediction that 
sensation seekers would not tend to volunteer 
for an experiment in confinement or SD, 
while they might be more likely to volunteer 
for experiments in hypnosis. 

Schachter (1959) has hypothesized that 
firstborns have high needs for affiliation and, 
therefore, prefer to be with other subjects 
while anticipating a stressful experiment. If 
this hypothesis held up we would expect that 
fewer firstborns would volunteer for an experi- 
ment involving SD and social isolation. Capra 
and Dittes (1962) found a high proportion 
of firstborns (76%) among volunteers for 
an experiment in small group interactions. 
However, Suedfeld (1964) found an equally 
high proportion of firstborns (79%) among 


4 T. I. Myers, D. B. Murphy, S. Smith, and S. J. 
Goffard, “Experimental Studies of Sensory Deprivation 
and Social Isolation.” Unpublished manuscript, 
HumRRO, United States Army Leadership Human 
Research Unit, Monterey, California, 1966. 

5 M. Zuckerman, unpublished data, 1966. 

volunteers for an experiment in SD, somewhat 
negating the affiliation-need interpretation of 
Capra and Dittes’ data. Schultz⁶ found a 
lower percentage of firstborns (58%) volunteering 
for SD, but this percentage was exactly 
the same as the proportion of the firstborns in 
the population (13 introductory psychology 
classes) from which the sample was drawn. 

The findings in the literature lead to con- 
flicting predictions regarding volunteering for 
SD. While volunteers for psychological stud- 
ies tend to be unconventional and arousal 
seeking, the volunteers for the Myers SD 
study (see Footnote 4) were lower on these 
traits, insofar as they are measured by the 
MMPI Pd scale. However, the prediction for 
hypnosis is clear: Volunteers should tend to 
be high sensation seekers. 


STUDY 1 
Method 


The Ss for this study consisted of 127 female 
undergraduates at Mary Washington College of the 
University of Virginia. 

In one group of 57 Ss, volunteers were solicited 
for a 3-hour SD study after the conditions of SD 
were described. In another group of 70 Ss, volun- 
teers were requested for an experiment in hypnosis. 
In both groups, the SSS was administered following 
the volunteering. 

Both groups were analyzed for birth order. First- 
borns (only and early borns) were compared with 
later borns. 


Results 


Of the first group, 42% volunteered for an 
SD experiment, and 69% of the second group 
volunteered for a hypnosis experiment. Al- 
though a higher percentage volunteered for 
hypnosis than SD, the difference was not 
significant (χ² = 2.89). There was a slightly 
lower incidence of firstborns in the volun- 
teers than in the nonvolunteers in both 
groups, but neither difference was significant 
(χ² for SD = 1.78, χ² for hypnosis = .81). 
Of the volunteers for SD, 46% were first- 
born, while 64% of the nonvolunteers were 
firstborn. Of the volunteers for hypnosis, 67% 
were firstborn, while 77% of the nonvolun- 
teers were firstborn. The data indicate no 
relationship between birth order and volun- 
teering for either type of experiment. 


6 D. Schultz, unpublished data, 1966. 

TABLE 1
COMPARISONS OF VOLUNTEERS AND NONVOLUNTEERS ON THE SENSATION-SEEKING SCALE (STUDY 1)

Group                        n    Percentage of N    SSS M    SSS SD    t
Sensory deprivation
  Volunteers (V)             24        42            22.54     4.18
  Nonvolunteers (NV)         33        58            17.30     5.56
  V versus NV                                                           4.00***
Hypnosis
  Volunteers (V)             48        69            23.40     4.52
  Nonvolunteers (NV)         22        31            16.23     5.92
  V versus NV                                                           4.94***

Note. Abbreviated: SSS = Sensation-Seeking Scale.
*** p < .001.

Table 1 gives the mean SSS scores for 
volunteers and nonvolunteers for each type of 
study. The volunteers for SD and the volun- 
teers for hypnosis experiments scored sig- 
nificantly higher than the nonvolunteers in 
each group on the SSS. The mean scores of 
the nonvolunteers are close to the mean of 
16.7 obtained in a group of 100 female under- 
graduates (Zuckerman et al., 1964). The 
scores in the volunteer groups represent sig- 
nificant deviations above this mean sensation- 
seeking score. 
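
The t values in Table 1 can be approximated from the tabled means, standard deviations, and group sizes. The paper does not state which formula was used, so the sketch below rests on an assumption: that the standard error is the unpooled form sqrt(s1²/(n1 - 1) + s2²/(n2 - 1)), which is what results when the tabled SDs were computed with n in the denominator. Under that assumption the computed values come out close to the tabled ones. The function name is illustrative.

```python
import math


def t_from_summary(m1, s1, n1, m2, s2, n2):
    """t for the difference between two independent group means.

    Assumed (not stated in the paper): an unpooled standard error,
    sqrt(s1^2/(n1-1) + s2^2/(n2-1)), with SDs that used n as the divisor.
    """
    standard_error = math.sqrt(s1 ** 2 / (n1 - 1) + s2 ** 2 / (n2 - 1))
    return (m1 - m2) / standard_error


# Summary statistics copied from Table 1 (Study 1).
t_sd = t_from_summary(22.54, 4.18, 24, 17.30, 5.56, 33)   # sensory deprivation
t_hyp = t_from_summary(23.40, 4.52, 48, 16.23, 5.92, 22)  # hypnosis

if __name__ == "__main__":
    print(f"SD volunteers vs nonvolunteers:       t = {t_sd:.2f}")
    print(f"Hypnosis volunteers vs nonvolunteers: t = {t_hyp:.2f}")
```

A pooled-variance t test would give somewhat different values, which is one reason the formula here should be read as a plausible reconstruction rather than the authors' documented procedure.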


STUDY 2 
Method 


The Ss for this study were 121 male under- 
graduates from introductory psychology classes at 
Villanova University. These Ss were tested in several 
classes. They were first given the SSS, followed by 
a questionnaire containing the Hypomania (Ma) 
and Taylor Manifest Anxiety (MA; Taylor, 
1953) scales from the MMPI (Hathaway & McKinley, 
1951). After taking these tests, the Ss were asked 
to signify whether they would like to be Ss in an 
SD and/or a hypnosis experiment. The Ss were 
told that they would be paid, but the rate of pay 
was not disclosed. They were also told that the 
time of the experiment would be fitted into their 
own schedules. The conditions of SD were described, 
and the Ss were asked to indicate if they would like 
to be in these experiments (yes), would not like 
to be (no), or might like to be depending on 
other factors (maybe). The Ss were also asked 
to indicate whether they would be interested in 
a maximum 3-hour, 8-hour, 24-48-hour, or 1-week 
SD session. Finally, the Ss were asked to indicate 
“yes” or “no” as to whether they would like to be 
in a hypnosis experiment. For purposes of analysis, 
the “yes” responders to SD were contrasted with 
the combined “maybe” and “no” groups. In the 
contrast of volunteers and nonvolunteers, the maxi- 
mum duration of SD was not considered. Another 
analysis was performed comparing the 3-hour, 
8-hour, 24-hour, and 1-week maximum duration 
volunteers. 


Results 


Table 2 shows the percentage of Ss volun- 
teering for both SD and hypnosis, SD only, 
hypnosis only, and neither experiment. The 
mean SSS scores and the F values derived 
from a 2 X 2 analysis of variance are also 
listed in this table. 

Volunteers were significantly higher than 
nonvolunteers for hypnosis on the SSS (p < 
.001). There was a near-significant tendency 
for the volunteers for hypnosis to be higher 
on the Ma scale (p < .10), but an examina- 
tion of the means indicated that this dif- 
ference was largely a function of the low Ma 
scores in the group which did not volunteer 
for either experiment. There were no signifi- 
cant differences between volunteers and non- 
volunteers for either experiment on the MA 
scale, although the volunteers for hypnosis 
had a somewhat higher mean than the non- 
volunteers. No differences were found on any 
of the three scales between volunteers for SD 
and nonvolunteers. None of the interactions 
between volunteering and nonvolunteering for 
hypnosis and SD experiments approached sig- 
nificance. 

The 66 volunteers for SD were broken down 
into those who volunteered for a maximum of 
3 hours (N = 5), 8 hours (N = 16), 24-48 
hours (N = 27), and 1 week (N = 18). 
Mean SSS scores showed a tendency to in- 
crease, and mean MA scale scores showed a 
tendency to decrease with increasing volun- 
teered duration of SD, but analysis of vari- 
ance on the three personality scales yielded 
low, nonsignificant F ratios between groups. 

The correlations among the three scales 
over all 121 Ss were: 

1. SSS versus Ma: r = .21, p < .05. 
2. SSS versus MA scale: r = -.01, ns. 
3. MA scale versus Ma: r = .58, p < .01. 
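
The significance levels attached to these correlations can be checked with the usual conversion of a Pearson r to a t statistic, t = r·sqrt(N - 2)/sqrt(1 - r²), with N - 2 degrees of freedom. The sketch below is a minimal illustration of that check. The critical values are approximate two-tailed values for about 120 degrees of freedom, and the names are illustrative assumptions rather than anything given in the paper.

```python
import math

# Approximate two-tailed critical t values for df near 120 (here df = 119).
CRITICAL_T_05 = 1.98
CRITICAL_T_01 = 2.62


def t_for_r(r, n):
    """Convert a Pearson correlation r based on n cases to a t statistic."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def significance_label(r, n):
    """Label a correlation using the approximate critical values above."""
    t = abs(t_for_r(r, n))
    if t > CRITICAL_T_01:
        return "p < .01"
    if t > CRITICAL_T_05:
        return "p < .05"
    return "ns"


if __name__ == "__main__":
    # Correlations reported for the 121 Ss of Study 2.
    for label, r in [("SSS vs Ma", 0.21), ("SSS vs MA", -0.01), ("MA vs Ma", 0.58)]:
        print(f"{label}: r = {r:+.2f}, t = {t_for_r(r, 121):.2f}, {significance_label(r, 121)}")
```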


The Sensation-Seeking Scale was signifi- 
cantly related to the Hypomania scale, and 
the Hypomania scale was significantly related 
to the Taylor Manifest Anxiety scale, but 
there was no relationship between the Sensa- 
tion-Seeking Scale and the Taylor Manifest 
Anxiety scale. 


STUDY 3 


The difference between Studies 1 and 2 in 
the results relating sensation seeking and 
volunteering for sensory deprivation experi- 
ments could have been due to a number of 
differences, including: sex of the Ss, regional 


TABLE 2
SCORES OF VOLUNTEERS AND NONVOLUNTEERS FOR EXPERIMENTS (STUDY 2)

          Volunteered for:
Group     SD      Hyp      n     Percentage of N    Mean SSS    Mean Ma    Mean MA
I         Yes     Yes      45         37             16.02       19.96      20.27
II        No      Yes      21         17             15.71       19.90      19.29
III       Yes     No       21         17             13.62       19.29      17.71
IV        No      No       34         28             12.41       17.88      17.79
All Ss                    121        100             14.44       19.26      18.77
F, V-SD versus NV-SD                                  1.11        <1         <1
F, V-Hyp versus NV-Hyp                               15.88***     2.95*      2.25
F, Interaction                                        <1          <1         <1

Note. Abbreviated: SSS = Sensation-Seeking Scale; V = volunteer; NV = nonvolunteer; SD = sensory deprivation;
Hyp = hypnosis.
* p < .10.
*** p < .001.


TABLE 3
COMPARISONS OF MALE AND FEMALE VOLUNTEERS AND NONVOLUNTEERS ON SENSATION-SEEKING SCALE (STUDY 3)

                      Males             Females            Both
Group               M       SD        M       SD        M       SD
Volunteers        21.61    5.25     17.26    4.59     19.74    5.42
Nonvolunteers     17.08    5.74     14.62    5.68     15.49    5.82
t                  2.44**            1.81*             3.62***

* p < .10.
** p < .02.
*** p < .001.


differences (north versus south), religious dif- 
ferences (predominantly Protestant versus 
Catholic), and the fact that the SSS was given 
after requesting volunteers for SD experi- 
ments in Study 1 and before asking for volun- 
teers in Study 2. In order to better control 
these factors, Schultz conducted a third study 
at the University of North Carolina at Char- 
lotte. In this study both male and female Ss 
were drawn from the same university popula- 
tion. 


Method 


The Ss were 109 college sophomores (54 males 
and 55 females) enrolled in courses in general 
psychology. The Sensation-Seeking Scale was ad- 
ministered, and following this volunteers were 
requested for a study involving 3 hours of sensory 
deprivation. The conditions of SD were described. 


Results 


Forty-one of 54 males (76%) and 31 of 55 
females (56%) volunteered for an SD experi- 
ment. The means and standard deviations of 
the male, female, and combined groups on the 
SSS are given in Table 3. 

The male volunteers were significantly 
higher than the male nonvolunteers on the 
SSS. The female volunteers were also higher 
than nonvolunteers, but the difference was 
significant at the .10 level rather than at the 
.05 level. The difference for the combined 
group was highly significant (p < .001). 


Discussion 


In both Studies 1 and 2 volunteers for 
hypnosis experiments were found to score 
significantly higher on the SSS than nonvolunteers. 
The results on volunteering for 
sensory deprivation experiments were less consistent. 
The female volunteers in Study 1 
were significantly higher than nonvolunteers 
on the SSS (p < .001), and the same trend 
was seen for females in Study 3, but the dif- 
ference in Study 3 was of a lower order of 
significance (p < .10). Since Study 3 was a 
replication and a conservative two-tailed test 
was used, the authors tend to interpret the 
latter difference as significant. The SSS 
scores of the females at the University of 
North Carolina (M = 16.1) were significantly 
lower than those of females at Mary Washing- 
ton College (M = 19.5). Perhaps these dif- 
ferences in the college populations could ac- 
count for the differing significance levels of 
the difference between female volunteers and 
nonvolunteers. 

In studies of male volunteers and non- 
volunteers for SD experiments, the results are 
more discrepant. In Study 2, using males 
from Villanova University, there was clearly 
no difference on the SSS between volunteers 
and nonvolunteers. In Study 3, conducted at 
the University of North Carolina, volunteers 
were significantly (p < .02) higher on the 
SSS than nonvolunteers. Thus the differences 
between male volunteers and nonvolunteers 
for SD experiments seem to depend more on 
regional or other differences than do those 
of females. In the case of the males, the 
SSS scores were considerably higher at the 
University of North Carolina (M = 20.5) 
than at Villanova University (M = 14.4). 
The results from Villanova University are 
closer to the original SSS male norms (M = 
15.1) obtained at another northern school, 
Adelphi University. For both males and fe- 
males the larger differences on the SSS be- 
tween volunteers and nonvolunteers were 
found at the schools where the mean SSS 
scores were higher. The standard deviations 
were close in all groups. This raises the pos- 
sibility that the differences between volun- 
teers and nonvolunteers for SD experiments 
may depend on the presence of a significant 
number of high sensation seekers in the 
populations sampled. At any rate, population 
differences seem to be important in the SSS, 
and further studies should be made to determine 
the factors associated with them. 

In three of the four groups (females, Study 
1; males, Study 2; females, Study 3; males, 
Study 3) sensation seeking was found to be 
associated with volunteering for SD experi- 
ments. In the two groups (females, Study 1; 
males, Study 2) where the relation between 
sensation seeking and volunteering for hyp- 
nosis experiments was studied, it was found 
to be significant. Other factors studied— 
hypomania, anxiety, and birth order—were 
not found to be related to volunteering. On 
the whole, the results support Schubert’s 
(1964) theory that volunteering for unusual 
experiments, in the absence of other strong 
incentives, is partly a function of a trait 
which he calls “arousal seeking” and we call 
“sensation seeking.” 


This study investigated whether continuation of a patient in a comprehensive 
program of community-based psychiatric rehabilitation was related to his 
overall “style” of coping behavior. Early in the service program, 86 ex-mental- 
patients were independently rated by 5 to 7 professionals on a scale consisting 
of 7 categories (fearful, dependent, impulsive, socially naive, withdrawn, self- 
deprecatory, hostile). At the close of the program, they were classified into 3 
groups: Ss who completed their assigned rehabilitation program (Cs; N = 33); 
Ss who dropped out before completion (DOs; N = 30); Ss who were administratively 
terminated (Ts; N = 23). The results indicated that the coping 
scale was a reliable assessment device. Cs were judged as significantly less 
impulsive and socially naive and significantly more self-deprecatory than the 
other 2 groups combined; Ts differed from DOs only in that the former were 
significantly more socially naive. 


It has become abundantly clear that the 
recent and continuing shifts in the discharge 
policies of mental hospitals are bringing a 
host of new problems in their wake. Large 
numbers of mental patients are now leaving 
the hospital after an average length of stay 
of 2 or 3 months, as compared to the situa- 
tion a decade or so ago when it was con- 
siderably longer. A consequence, however, of 
the greatly accelerated discharge rates has 
been the proportionate rise of recommitment 
rates. As a result, the health and welfare field 
has been paying increased attention to the 
creation of complex networks of “aftercare” 
services, with the goal of maintaining the 
discharged patient in the community. 

The discharged mental patient faces nu- 
merous problems attendant upon transition 
from the hospital to the community. One of 
the things he is expected to do is to find some 
kind of remunerative employment. While 
having a job is apparently not a sufficient 
condition for remaining out of the hospital
(cf. Freeman & Simmons, 1963), there is
still a basis for supposing that it is one of
the necessary conditions. It is for this reason
that aftercare programs are increasingly in-
corporating service features designed to help
the ex-mental-patient make a vocational ad-
justment. The present research arises from a
large-scale study of the factors involved in
the vocational rehabilitation of the discharged
mental patient.


1 This study was carried out under research and
demonstration Grant RD-990-p from the Vocational
Rehabilitation Administration, United States Depart-
ment of Health, Education, and Welfare, to the
Institute for the Crippled and Disabled of New
York City. It is one aspect of a large-scale study
of the factors involved in the rehabilitation of the
vocationally disadvantaged ex-mental-patient that
is being conducted in cooperation with the New
York State Division of Vocational Rehabilitation,

The general issue to which this research 
addressed itself concerns the emotionally 
disabled individual’s tolerance for treatment, 
as well as the community agency’s tolerance 
for the individual. Previous efforts to explore 
this problem have focused on the individual’s 
tolerance for continuation when treatment 
consists of casework, child guidance, or psy-
chotherapy (Levinger, 1960). It was the pur-
pose of the current investigation to extend
these explorations by studying both the ex- 
mental-patient’s and the agency’s tolerances 
in a comprehensive rehabilitation program 
consisting of psychological, social, and voca- 
tional services. More specifically, it asked 
whether people who are able to complete 
their assigned rehabilitation programs differ 
from those who drop out, as well as from 
those who are administratively terminated
during their programs.

Based on clinical impressions formed in
earlier studies of psychiatric rehabilitation
(cf. Gellman, Friedman, Gendel, Glaser, &
Neff, 1957; Neff, 1959), it was thought that 
a relationship might exist between an indi- 
vidual’s ability to continue in a comprehen- 
sive rehabilitation program and his overall 
coping style. For the purposes of the present 
research, the latter was defined as the char- 
acteristic and predominating manner in which 
the individual responded to interpersonal pres-
sures. Seven such coping styles were worked
out through pilot studies of staff-patient 
interactions (see Table 1). A procedure was 
then developed through which persons could 
be rated on the degree to which one or an- 
other of these coping styles was predominant 
in his behavior. If reliable ratings could be 
made, subjects could then be classified by 
coping style. The resulting instrument, de- 
scribed in detail below, is called “the coping 
scale.” In the present paper, the basic hy- 
pothesis was that coping style is related to 
the ability of ex-mental-patients to maintain 
themselves in an intensive and long-term 
program of ameliorative treatment. 


METHOD 


The rehabilitation program required the pa- 
tient’s daily attendance at a rehabilitation facility 
from 9 am to 4 pm. This service program was 
divided into two phases (Neff & Koltuv, 1964). 
Phase 1 consisted of a 7-week evaluation period 
during which time patients were observed in a 
graded series of work-related situations of increasing 
difficulty and pressure. Where necessary, this period 
could be extended by 3 to 6 weeks.

In most instances, this evaluation phase was 
sequentially conducted, first in a vocationally- 
oriented occupational therapy unit, then in a 
sheltered workshop setting, followed by a standard- 
ized job-sample assessment program. During Phase 1, 
patients also received approximately 20 hours of 
interviewing and psychometric testing by a team 
consisting of at least six kinds of professionals: a 
social worker, a vocational counselor, a psychologist,
a psychiatrist, an occupational therapist, and a
research assistant. Depending on the case, the pa-
tients might also be simultaneously involved in
individual or group psychotherapy or encouraged to
participate in group work. Typically during this
period, patients interacted with approximately a
dozen different staff members of the facility, as 
well as with a large number of rehabilitation clients 
with a wide range of disabilities, including many 
with visible physical impairments. 

Following this first phase, patients then entered
Phase 2, which consisted of an average of 26 weeks
of classroom trade training or workshop skill train-
ing. During this time, patients received, in accordance
with their needs, any of the following additional
services: individual and group psychotherapy, in-
dividual and group counseling, casework, and group
work. Both the length and type of evaluation and
training provided to each patient were tailored to ac-
cord with the staff’s judgment of that patient’s
needs.


Subjects 


The Ss involved in the present research were 
drawn from a pool of 99 ex-mental-patients who 
were referred to the rehabilitation facility by the 
State Division of Vocational Rehabilitation. Of these 
99 cases, 11 were sent into outside trade schools fol- 
lowing their Phase 1 evaluations, primarily be- 
cause the rehabilitation facility did not offer the 
recommended training course. These 11 cases were 
dropped from this study to maximize the homo- 
geneity of both the milieu that the patients were 
required to cope with and the tolerance exhibited 
for different kinds of patient behaviors. 

The remaining 88 individuals were subdivided in 
the following manner: At the point when the ser- 
vice phases of the project had ended, each patient 
was classified as to whether he had completed his 
assigned service program (C), had dropped out 
before completion (DO), or had been administra-
tively terminated as “vocationally unfeasible” (T).
Of the 88 patients, 86 were capable of being so 
classified and constituted the sample for this study. 

Among the 86 Ss were 33 Cs, 30 DOs, and 23 Ts. 
Sixteen DOs terminated during or right after Phase 1 
evaluation, with the remaining 14 DOs leaving the 
program during the Phase 2 training sequence. Of 
the Ts, 9 were administratively terminated during 
or right after evaluation and 14 during training.
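
The sample accounting reported above can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the published counts; the variable names are illustrative, and the figure of 2 unclassifiable cases is inferred from the statement that 86 of the 88 remaining patients could be classified.

```python
# Counts as reported in the Subjects section.
REFERRED = 99
SENT_TO_OUTSIDE_SCHOOLS = 11
UNCLASSIFIABLE = 2  # inferred: 88 remaining patients minus the 86 classified

STATUS_GROUPS = {
    "C": {"total": 33},
    "DO": {"total": 30, "phase_1": 16, "phase_2": 14},
    "T": {"total": 23, "phase_1": 9, "phase_2": 14},
}


def check_sample():
    """Return the derived counts so they can be compared with the text."""
    remaining = REFERRED - SENT_TO_OUTSIDE_SCHOOLS
    sample = remaining - UNCLASSIFIABLE
    classified = sum(group["total"] for group in STATUS_GROUPS.values())
    # Groups with a phase breakdown must add up to their own totals.
    phases_ok = all(
        group["phase_1"] + group["phase_2"] == group["total"]
        for group in STATUS_GROUPS.values()
        if "phase_1" in group
    )
    return {
        "remaining": remaining,
        "sample": sample,
        "classified": classified,
        "phases_ok": phases_ok,
    }


if __name__ == "__main__":
    print(check_sample())
```
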

The 86 Ss had a mean age of 30.2 years (SD =
8.1 yr.); 56 were males and 30 were females.
Seventy bore a diagnostic label of schizophrenia, and 
all were free from any disability other than a 
functional emotional disorder. Fifty-seven Ss had 
been hospitalized two or more times, 19 had one 
earlier hospitalization, and the remaining 10 had 
no previous hospital history, but had been in one 
or another kind of treatment situation. At the 
time of entrance into the rehabilitation program, 
only 11 were married. The mean educational attain-
ment of the group was twelfth grade.


The Coping Scale 


The instrument consisted of seven coping styles, 
each defined by an accompanying brief description 
(see Table 1). Raters were asked to rank these 
broad types of behavior in terms of their predomi- 
nance in the client’s make-up and rate each of 
these behaviors on a four-point scale: 1, very 
predominant; 2, somewhat predominant; 3, slightly 
predominant; 4, not predominant. The accompany- 
ing instructions requested raters to make their judg- 
ments “on the basis of the client’s behavior (how 
he acts and what he says) and not on the basis 
TABLE 1


CATEGORIES AND SUPPORTING DEFINITIONS 
OF THE COPING SCALE 


Fearful 


Among other things, this sort of individual may be 
tense, fidgety, jumpy, uneasy, may be frequently 
troubled or worried, may be afraid and timid in his 
relationship with others, may be afraid to establish 
contact with others, may seem mousy, may shy away 
from things and people. 


Dependent 


This kind of individual might give the appearance of 
being impotent in dealing with the world by himself. 
Among other things, he may frequently ask help from 
others, may rely on others for support, may be unable 
to initiate action on his own, may place himself in the 
position of making others direct him, may be highly 
compliant, may seek others’ approval. 


Impulsive 


Among other things, this sort of individual may 
rarely see a task through, may be unable to stick to a 
plan of action, may flit from one thing to another, may 
be unable to delay the gratification of his impulses, 
may immediately seek to satisfy his desires, may 
easily become enthusiastic about something and then 
rapidly lose the enthusiasm. 


Socially naive 


This kind of individual may be unperceptive when 
it comes to the needs or feelings of others, may not 
realize that his behavior elicits reactions from others 
or has an effect on them, may be socially inept, may 
not seem to know what is appropriate in ordinary 
social situations. 


Withdrawn, apathetic 


Among other things, this kind of individual may be 
bland, lethargic, may lack vitality, may give the 
impression of being indifferent to things going on 
around him, may lack emotional responsivity, may 
seem very easygoing and uninvolved. 


Self-deprecatory 


Among other things, this sort of individual may 
point up and willingly talk about his deficiencies, may 
be highly self-critical, may talk about his ineptitude, 
may derogate his qualities and abilities, may generally 
run himself down, may express self-doubts. 


Hostile 


Among other things, this sort of individual may be 
angry with others most of the time, may be subtly 
negativistic, may contradict and argue with others, 
may do things to irritate and annoy others, may be 
sarcastic, may belittle or insult others, may criticize 
others. 
of inferences about underlying motivations and
dynamics.” Each S was independently ranked and 
rated by the service staff members upon the com- 
pletion of his first 2 weeks in the program. The 
completed material was then filed away and was 
not available to the service staff at the later time 
when the subject pool was subdivided into Cs, Ts, 
and DOs. 

Five to seven staff members judged each S, 
depending upon the number of professionals with 
whom he was in contact during his first 2 weeks in 
the program. There was an average of 5.8 judges 
per client. Since there was an unequal number of 
raters across Ss, the intraclass coefficient of cor- 
relation (Haggard, 1958) was used to determine 
the reliabilities of both the rankings and the ratings. 
The focus of the reliability study was upon the 
interjudge agreement on each of the behavior cate- 
gories (impulsive, dependent, etc.) across Ss, and 
was computed for the total group of 99 individuals. 
Since the interjudge reliabilities for the ratings were 
substantially higher than those for the rankings, 
only the ratings on the seven behavior categories 
were subsequently used in this study. 

Table 2 presents means, standard deviations, and 
intraclass Rs for the average of the ratings for each 
of the seven behavior categories. The scoring of
each category was such that the lower scores in-
dicated a higher degree of predominance of the be-
havior in question. The reliabilities ranged from .56
to .79 and were deemed to be moderately high,
since the intraclass R provides an approximation
of the squared coefficient of reliability obtained
by the Pearson r (Cronbach, Rajaratnam, & Gleser,
1963).
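
As an illustration of how an intraclass coefficient can be obtained when the number of judges varies from subject to subject, the sketch below uses the common one-way analysis-of-variance form with an adjusted average number of raters (k0). It is an assumption that this matches the computation taken from Haggard (1958); the function name and the example ratings are hypothetical and are not data from the study.

```python
def intraclass_r(ratings):
    """One-way intraclass correlation for unequal numbers of raters.

    ratings: list of lists, one inner list of judges' ratings per subject.
    """
    n_subjects = len(ratings)
    counts = [len(r) for r in ratings]
    n_total = sum(counts)
    grand_mean = sum(sum(r) for r in ratings) / n_total
    subject_means = [sum(r) / len(r) for r in ratings]

    # Between-subjects and within-subjects sums of squares.
    ss_between = sum(
        k * (m - grand_mean) ** 2 for k, m in zip(counts, subject_means)
    )
    ss_within = sum(
        (x - m) ** 2 for r, m in zip(ratings, subject_means) for x in r
    )
    ms_between = ss_between / (n_subjects - 1)
    ms_within = ss_within / (n_total - n_subjects)

    # Adjusted "average" number of raters for unequal group sizes.
    k0 = (n_total - sum(k * k for k in counts) / n_total) / (n_subjects - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)


if __name__ == "__main__":
    # Hypothetical ratings on the 4-point scale, 5 to 7 judges per subject.
    example = [
        [1, 2, 1, 1, 2],
        [3, 4, 3, 3, 4, 4],
        [2, 2, 3, 2, 2, 3, 2],
        [4, 3, 4, 4, 4],
    ]
    print(round(intraclass_r(example), 3))
```

When every judge gives a subject the same rating, the within-subjects mean square is zero and the coefficient equals 1; disagreement among judges pulls it downward.
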


RESULTS 


Using the mean rating for each of the 86
subjects on each scale category, a two-way
analysis of variance was performed to ex-
amine differences between and within subjects
by status group (C, T, DO), sex, and scale
category. Table 3 presents the mean of the
ratings for each scale category by sex within
status groups, and Table 4 presents a sum-
mary of the findings from the analysis of
variance. Two of the F tests were significant:
the main effect for typology categories (C)
and the Status Group X Typology Category
(A X C) interaction.


TABLE 2
MEANS, STANDARD DEVIATIONS, AND INTRACLASS Rs,
BASED ON RATINGS MADE BY FIVE TO
SEVEN JUDGES ON 99 SUBJECTS

Statistic       F      D      I      SN     W      SD     H
M               2.10   2.12   2.49   2.65   2.67   2.79   2.86
SD               .54    .48    .67    .69    .77    .58   [illegible]
Intraclass R     .60    .56    .63    .78    .79    .64   [illegible]

Note. Abbreviated: F = fearful; D = dependent; I = impulsive;
SN = socially naive; W = withdrawn, apathetic;
SD = self-deprecatory; H = hostile.


TABLE 3
MEAN RATINGS FOR COPING CATEGORIES, BY STATUS GROUPS AND SEX OF SUBJECTS

Group           F      D      I      SN     W      SD     H
Completers
  Male          1.96   2.08   2.64   2.74   2.45   2.63   2.93
  Female        2.22   2.30   2.70   2.86   3.01   2.59   2.98
Terminators
  Male          2.08   2.06   2.26   2.47   2.65   3.14   2.59
  Female        1.70   1.43   2.43   1.83   2.40   2.75   2.95
Dropouts
  Male          2.18   2.04   2.44   2.69   2.70   2.79   2.89
  Female        2.04   2.20   2.32   2.72   2.84   2.80   2.68
All Ss          2.07   2.07   2.48   2.63   2.68   2.79   2.83

Note. N = 86. Abbreviated: F = fearful; D = dependent; I = impulsive;
SN = socially naive; W = withdrawn, apathetic;
SD = self-deprecatory; H = hostile.


A Duncan multiple-range test was per-
formed on scale category means across subjects
in order to further define the significant main
effect. The results indicated that there was a
significant difference between dependent and
fearful, taken together, and all the other
scale categories. Reference to the bottom line
of Table 3 indicates that these two character-
istics are the most predominating traits of
the sample. Impulsive, the third most pre-
dominating trait, was significantly different
from self-deprecatory and hostile, which rep-
resent the least predominating traits.

Table 5 presents the results of a series of
orthogonal comparisons which were designed
to study the sources of the Scale Category X
Status Group interaction. In these compar-
isons, the completers (Cs) were contrasted
with all subjects who did not complete the
program (Ts + DOs); within subjects who
did not complete the program, the Ts were
contrasted with the DOs.


TABLE 4
ANALYSIS OF VARIANCE SUMMARY

Source                   df     MS      F
Between Ss
  Status group (A)        2    1.38    2.091
  Sex (B)                 1    [illegible]
  A X B                   2    1.75    2.651
  Error (between)        80     .66
Within Ss
  Coping category (C)     6    8.78   28.322***
  A X C                  12     .66    2.129**
  B X C                   6     .26    <1
  A X B X C              12     .42    1.355
  Error (within)        480     .31

Note. N = 86.
** p < .025.
*** p < .001.


TABLE 5
ORTHOGONAL COMPARISONS OF COPING
CATEGORIES FOR STATUS GROUPS

Source                   df     MS      F
Fearful
  C versus T + DO         1     .007   <1
  T versus DO             1     .300   <1
Dependent
  C versus T + DO         1    [illegible]
  T versus DO             1     .424   1.372
Impulsive
  C versus T + DO         1    2.058   6.659**
  T versus DO             1     .111   <1
Naive
  C versus T + DO         1    1.729   5.595**
  T versus DO             1    2.049   6.631**
Withdrawn
  C versus T + DO         1     .013   <1
  T versus DO             1     .336   1.087
Self-deprecatory
  C versus T + DO         1    1.821   5.895**
  T versus DO             1     .780   2.524
Hostile
  C versus T + DO         1     .806   2.608
  T versus DO             1     .219   <1
Error                   480     .309

Note. N = 86. Abbreviated: C = completer; T = ter-
minator; DO = dropout.
** p < .025.


FIG. 1. Mean ratings by status group and all subjects (N = 86).

Examination of Table 5, with reference to 
the mean scores summarized in Table 3, in- 
dicates the following. The significant Scale 
Category X Status Group interaction was 
mainly related to status group differences on 
three of the seven categories: (a) Cs were 
significantly less impulsive (p < .025) than 
Ts and DOs combined, while Ts did not differ 
from DOs on this category; (b) Cs were sig- 
nificantly less socially naive than noncom- 
pleters (p < .025), and Ts were significantly 
more naive than DOs (p < .025); and (c) 
Cs were judged significantly more self- 
deprecatory than noncompleters (p < .025), 
with no significant difference on this cate- 
gory between Ts and DOs. 
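
The F ratios behind these comparisons can be reproduced from the mean squares in Table 5, since each comparison has 1 degree of freedom and is tested against the within-subjects error term. The sketch below is a check on the tabled values only. The critical value of about 5.05 for F(1, 480) at the .025 level is an assumed approximation taken from standard tables, not a figure given in the paper, and comparisons whose F is reported only as less than 1 (or is illegible) are omitted.

```python
ERROR_MS = 0.309           # error mean square from Table 5 (df = 480)
ASSUMED_CRITICAL_F = 5.05  # approximate F(1, 480) at .025; an assumption

# (category, comparison): (mean square, F as printed in Table 5)
COMPARISONS = {
    ("Dependent", "T vs DO"): (0.424, 1.372),
    ("Impulsive", "C vs T+DO"): (2.058, 6.659),
    ("Naive", "C vs T+DO"): (1.729, 5.595),
    ("Naive", "T vs DO"): (2.049, 6.631),
    ("Withdrawn", "T vs DO"): (0.336, 1.087),
    ("Self-deprecatory", "C vs T+DO"): (1.821, 5.895),
    ("Self-deprecatory", "T vs DO"): (0.780, 2.524),
    ("Hostile", "C vs T+DO"): (0.806, 2.608),
}


def f_ratio(mean_square, error_ms=ERROR_MS):
    """F for a single-df comparison: comparison MS over error MS."""
    return mean_square / error_ms


def significant_comparisons(critical_f=ASSUMED_CRITICAL_F):
    """Return the comparisons whose recomputed F exceeds the critical value."""
    return {
        key
        for key, (mean_square, _printed) in COMPARISONS.items()
        if f_ratio(mean_square) > critical_f
    }


if __name__ == "__main__":
    for key, (mean_square, printed) in COMPARISONS.items():
        print(key, round(f_ratio(mean_square), 3), "printed:", printed)
    print(sorted(significant_comparisons()))
```

Under these assumptions the recomputed ratios agree with the printed values to within rounding, and the four comparisons that exceed the critical value are the same ones the text reports as significant.
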

The interaction of category and group may 
be seen graphically in Figure 1. The inter-
action appeared to be largely a function of
the opposing profiles of the Cs and Ts. The
former tended to be less predominating on
each typology category than the latter, with
the exception of self-deprecatory, in which
the order is reversed. The DO group tended
to be midway between the former two groups
and was close to the means for the entire
sample.


Discussion 


In considering the obtained results, the
reader must be reminded that the judges who
produced the ratings were also instrumental
determinants of the client’s continuation in
the program. To an indeterminate extent,
therefore, the early ratings and the later
staff decisions may be confounded. An early
judgment that a client was predominatingly
hostile, for example, might so influence a
staff member’s attitude to the client that the
client might decide to leave the program.
Although precautions were taken to make
the early ratings unavailable to the service
staff, they could have been remembered to a
variable extent. In theory, it is possible to 
think of the determinants of a service out- 
come as being made up of two kinds of com- 
ponents: those comprising the client’s actual 
behavior and those comprising a profes- 
sional’s opinions about the client’s behavior. 
In practice, however, these two sets of com- 
ponents tend to be confounded, since the 
people who are in the best position to judge 
the client’s behavior are also in the best 
position to influence it. This is a familiar 
dilemma of all service-oriented research, one 
from which the present study could not 
escape. In another sense, we had no wish to 
escape it, since it can be assumed that one 
of the determinants of client continuation is 
the attitude of the professional staff. 

Given that the above data represent an 
indeterminate mix of client behavior and staff 
judgment, it would appear that the ability 
of the ex-mental-patient to sustain a compre- 
hensive program of rehabilitation is related 
to judgments concerning his characteristic 
coping behavior. Specifically, Figure 1 sug- 
gests that there is an overall tendency for 
patients who do not sustain such a program 
(T + DO) to exhibit greater amounts of
traits which, by definition, seem maladaptive. 
The statistically salient attributes of the 
“good patient,” that is, the program com- 
pleter, are less impulsivity and social naiveté 
and greater self-deprecation. 

The finding with regard to impulsivity 
accords with studies which report that in- 
dividuals who continue in psychotherapy 
are judged less impulsive than those who 
break off treatment (Rubenstein & Lorr, 1956; 
Taulbee, 1958). The impaired ability of 
the socially naive individual to maintain 
himself in the rehabilitation program may 
stem from the fact that the services provided 
in the current instance required, by their 
content and their philosophy, high doses of 
continued and relatively constructive inter- 
action with others. Individuals characterized 
by greater interpersonal ignorance and in-
sensitivity were more likely to experience dis-
comfort and punishment in coping with these 
social demands. 

An interesting finding is that program sus- 
tainers are more likely to manifest self-
deprecatory qualities. Psychotherapy studies
suggest that patients who continue in therapy 
express greater personal dissatisfaction and 
feelings of inadequacy than those who drop 
out (Lorr, Katz, & Rubenstein, 1958; Ruben- 
stein & Lorr, 1956; Taulbee, 1958). It may 
well be that the elevated self-deprecation 
found in the present study represents a more 
realistic self-assessment and a willingness to 
discuss and deal with personal problems and 
inadequacies. 

Of the three status groups in the present 
study, Figure 1 indicates that clients who 
are administratively terminated can generally 
be characterized as having a greater pre- 
dominance of certain dysfunctional behavior 
characteristics. The service staff’s apparent 
lower tolerance for this kind of client may 
occur because these maladaptive behaviors 
are unacceptable and threatening in their 
own right and are seen as representing major 
obstacles to the helping process. The behavior 
profiles for the dropouts appear midway be- 
tween those of the completers and of termi- 
nated clients. Although possessing somewhat 
more of the traits in question than the 
completers, this group by itself is not dis- 
tinctive enough in its judged behavior to 
provide a basis for understanding its program 
status.

In summary, the coping scale proved to be 
a relatively reliable and useful means of ac- 
counting for some of the variance within the 
project sample. The results tended to bear out 
certain earlier clinical impressions and sup- 
port the possibility of using the global and 
habitual manner in which people cope with a 
set of social demands as an indication of their 
ability to sustain treatment. In other papers, 
now in preparation, the coping-scale data 
are related to a wide range of other measures 
and to a series of outcome criteria obtained 
through a follow-up study of the research 
sample. 
EFFECT OF IRRELEVANT PERIPHERAL VISUAL STIMULI
ON DISCRIMINATION LEARNING IN MINIMALLY
BRAIN-DAMAGED CHILDREN 1


ROBERT MITCHELL BROWNING 2 
University of Mississippi 


The hypothesis that minimally brain-damaged children are more distractible 
than non-brain-damaged children was tested in a series of 3 experiments.
Experiment 1 demonstrated that the distracting condition of peripheral visual 
stimuli interfered with discrimination learning in non-brain-damaged Ss. In 
Experiment 2, the distracting condition again interfered with learning in non- 
brain-damaged Ss, but failed to have this effect with brain-damaged Ss. When 
differences in IQ were controlled statistically, the performance differences be- 
tween brain-damaged and non-brain-damaged groups were no longer sig- 
nificant. The distracting condition did not interfere with learning in a group 
of older brain-injured Ss used in Experiment 3. The results failed to support 
the hypothesis that a task-irrelevant distracting condition of peripheral visual 
stimuli would affect the performance of brain-injured Ss more than that of
non-brain-injured Ss.


This experiment was designed to test the 
hypothesis that task-irrelevant visual stimuli 
would interfere with learning in a hetero- 
geneous group of minimally brain-damaged 
children and that the effect would be greater 
than that in a group of non-brain-damaged 
children of comparable age and intelligence. 
The hypothesis was predicated on the basis 
of the Strauss and Lehtinen (1947) hy- 
pothesis that distractibility is a major be- 
havioral consequence of cerebral injury in 
children. Distractibility, defined as a mani- 
festation of uncontrollable hyperresponsive- 
ness to external stimuli (Strauss & Lehtinen, 
1947, p. 130), has been considered instru-
mental in the learning problems of brain-in-
jured children (Cruickshank, Bentzen, Ratze-
burg, & Tannhauser, 1961; Strauss &
Lehtinen, 1947). The conceptualization of
distractibility as a behavioral characteristic
of brain-damaged children has received wide-
spread acceptance and corroborating clinical
reports (Anderson, 1963; Bakwin, 1949;
Benton, 1962; Bradley, 1955; Bradley, 1957;
Burks, 1960; Clements & Peters, 1962; Den-
hoff, Laufer, & Holden, 1959; Eisenberg,
1964; Ingram, 1956; Paine, 1962).


1 [...] submitted to the University of Mississippi in partial
fulfillment of the requirements for the degree of [...]

A review of the literature revealed an ab- 
sence of controlled experimental studies deal- 
ing specifically with the relationship between 
irrelevant visual stimuli and learning in the 
minimally brain-injured child. For such a 
study to be conducted, several procedural 
requirements must be satisfied. For example, 
according to Strauss’ formulations (Strauss 
& Kephart, 1955; Strauss & Lehtinen, 1947), 
the behavioral consequences of brain injury 
in children diminish with increasing age. 
There are numerous reports in the literature 
which conclude that behaviors such as hy- 
peractivity and distractibility, which are ob- 
served to occur in childhood, decrease and 
often disappear with onset of adolescence 
(Cromwell, Baumeister, & Hawkins, 1963; 
Laufer, 1962; Laufer, Denhoff, & Solomons, 
1957). Thus, a test of Strauss’ hypothe- 
sis would require school-age preadolescent 
children. 

Although the parameters of intelligence for 
the minimally brain-damaged child have not 
been established, it has been suggested that 
the group to which Strauss refers overlaps the 
lower half of the distribution of normal chil- 
dren and the upper half of retarded children 
(Elkind, Koegler, Go, & Van Doornick, 1965). 
A test of Strauss’ hypothesis would necessi-
tate selection of a sample of brain-damaged 
children whose mean intelligence level was 
average. According to the hypothesis of 
Strauss and Lehtinen (1947), brain injury in 
children has relatively uniform effects. Since 
the hypothesis of distractibility is not related 
to such variables as locus, kind, extent, and 
duration of injury, the appropriate experi- 
mental design to test the hypothesis would 
be random assignment of subjects to known 
distracting and nondistracting conditions, 
rather than grouping subjects by specific 
evidence of organicity. 

The study was divided into three experi- 
ments. The first experiment was to deter- 
mine if the distracting stimulus condition, as 
developed from earlier pilot studies, did inter- 
fere with discrimination learning in a group 
of children without brain damage. The second 
experiment was to compare the performance 
of young minimally brain-injured children 
with that of children without brain injury, 
both with and without the distracting stim- 
ulus condition. The third experiment was to 
determine if the distracting stimulus con- 
dition had the same effect with older mini- 
mally brain-injured children as it did with the 
younger group. 


EXPERIMENT 1 
Method 


Subjects. Sixty-six Caucasian children without brain
damage, ranging in age from 58 to 112 months and
in grade placement from preschool to third grade,
were randomly assigned to either the experimental
(N = 35) or control (N = 31) group. All experi-
mental groups in the study learned the discrimina- 
tion task in the presence of task-irrelevant peripheral 
visual stimuli; control groups learned the task with- 
out the distracting condition. The Peabody Picture 
Vocabulary Test (PPVT), Form A, was used to 
obtain the MA and IQ for Ss in all three experi- 
ments. The groups in Experiment 1 were com- 
parable on the variables of CA, MA, and IQ. 

Apparatus. The apparatus was a console with 
three plexiglass windows which S could press to 
indicate his choice on each trial of the discrimina- 
tion task. A receptacle centered above these windows 
delivered candy-corn reinforcements for correct 
responses. The S was partially surrounded by three
wall panels which were attached to the console. The 
side and ceiling panels each housed a battery of 
multicolored lights which flashed at various fre- 
quencies and functioned as the peripheral visual 
stimulus condition. During the control condition 
these light batteries were concealed. The apparatus
was covered with acoustical matting to reduce
extraneous sounds. There was no visual contact
between E and S during the learning trials.

The discrimination task materials were 17 cards, 
each having the stimulus figures of a circle, square, 
and triangle. The cards were ordered so that the 
triangle, which was the discriminative stimulus, 
was never of the same color (orange, green, or blue) 
or position under the three windows on any two 
successive trials. 
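
The ordering constraint on the stimulus cards can be made concrete with a short sketch. It generates a deck of 17 cards in which the triangle never keeps the same color or the same window position on two successive trials, and it also enforces the constraint between the last and first card, on the assumption that the deck is cycled without reshuffling. The function names and the random generation procedure are hypothetical; the paper does not say how the actual order was constructed.

```python
import random

COLORS = ("orange", "green", "blue")
POSITIONS = (0, 1, 2)  # left, center, right window


def is_valid(cards):
    """True if color and position both change on every successive trial,
    including the wrap from the last card back to the first."""
    pairs = zip(cards, cards[1:] + cards[:1])
    return all(a[0] != b[0] and a[1] != b[1] for a, b in pairs)


def make_card_sequence(n_cards=17, seed=0):
    """Build a deck of (triangle color, triangle position) pairs."""
    rng = random.Random(seed)
    while True:
        cards = [(rng.choice(COLORS), rng.choice(POSITIONS))]
        while len(cards) < n_cards:
            prev_color, prev_pos = cards[-1]
            color = rng.choice([c for c in COLORS if c != prev_color])
            pos = rng.choice([p for p in POSITIONS if p != prev_pos])
            cards.append((color, pos))
        # Retry if the deck would break the rule when it repeats.
        if is_valid(cards):
            return cards


if __name__ == "__main__":
    for card in make_card_sequence():
        print(card)
```
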

Procedure. The S was informed that his task was 
to learn which was the correct picture to select to 
operate the candy machine. The same 17 stimulus 
cards were repeated successively until the criterion 
of 10 consecutively correct responses was attained. 
Every correct response was reinforced with candy 
corn. 
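
The dependent measure, trials to criterion, can be sketched as follows. The function scans a record of correct and incorrect responses and returns the trial on which the tenth consecutive correct response occurs. Counting the criterion run itself as part of the score is an assumption, since the paper does not state the scoring convention, and the function name is illustrative.

```python
def trials_to_criterion(responses, run_length=10):
    """Return the 1-based trial on which the criterion run is completed.

    responses: iterable of booleans, True for a correct choice.
    Returns None if the criterion is never reached.
    """
    streak = 0
    for trial, correct in enumerate(responses, start=1):
        # An error resets the count of consecutive correct responses.
        streak = streak + 1 if correct else 0
        if streak == run_length:
            return trial
    return None


if __name__ == "__main__":
    record = [False, True, False] + [True] * 10
    print(trials_to_criterion(record))
```
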


Results 


The mean trials to criterion and respective 
standard deviations for the experimental and 
control groups are shown in Table 1. The F 
ratio revealed that the experimental group was 
significantly more variable than the control 
group on the measure of trials to criterion 
(F = 3.116, df = 34/30, p < .01). In testing
for difference between the means of the ex-
perimental and control groups, the modified t
test for samples of unequal Ns with hetero-
geneity of variance was used (Edwards,
1960). It was found that the experimental
group required significantly more trials to
reach criterion than the control group (re-
quired t.05 = 2.03; obtained t = 2.22, df =
34/30, p < .05, two-tailed).
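
These statistics can be approximately reproduced from the means and standard deviations in Table 1. The sketch below assumes that the standard error of the difference uses N - 1 denominators and that the required t is a weighted average of the tabled .05 values for 34 and 30 degrees of freedom (2.032 and 2.042), in the manner of the Cochran and Cox approximation. Both points are assumptions chosen because they recover the reported figures; the paper cites only Edwards (1960) for the procedure.

```python
import math

# Table 1 values: mean trials to criterion, SD, and N for each group.
EXPERIMENTAL = {"mean": 79.71, "sd": 72.15, "n": 35}
CONTROL = {"mean": 47.61, "sd": 40.79, "n": 31}

# Two-tailed .05 critical values of t from standard tables.
T05_DF34 = 2.032
T05_DF30 = 2.042


def variance_ratio(group_a, group_b):
    """F ratio of the two sample variances."""
    return group_a["sd"] ** 2 / group_b["sd"] ** 2


def modified_t(group_a, group_b):
    """t for unequal variances, plus the two variance-of-mean terms."""
    w_a = group_a["sd"] ** 2 / (group_a["n"] - 1)  # assumed N - 1 denominator
    w_b = group_b["sd"] ** 2 / (group_b["n"] - 1)
    t = (group_a["mean"] - group_b["mean"]) / math.sqrt(w_a + w_b)
    return t, w_a, w_b


def required_t(w_a, t_a, w_b, t_b):
    """Critical t as the weighted average of the two tabled values."""
    return (w_a * t_a + w_b * t_b) / (w_a + w_b)


if __name__ == "__main__":
    f = variance_ratio(EXPERIMENTAL, CONTROL)
    t, w_exp, w_con = modified_t(EXPERIMENTAL, CONTROL)
    crit = required_t(w_exp, T05_DF34, w_con, T05_DF30)
    print(round(f, 3), round(t, 2), round(crit, 2))
```

The variance ratio computed from the rounded standard deviations comes out slightly above the printed 3.116, which is consistent with the published value having been computed from unrounded variances.
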

Experiment 1 demonstrated that peripheral 
flashing visual stimuli interfered with the 
non-brain-damaged subjects’ learning of the 
discrimination task. It was, therefore, decided 
that the experimental condition would be a
satisfactory one to test the effect of irrelevant 
visual stimuli on the learning of minimally 
brain-damaged children. 


TABLE 1 


MEAN TRIALS TO CRITERION FOR EXPERIMENTAL AND
CONTROL GROUPS IN EXPERIMENT 1 OF
NON-BRAIN-DAMAGED CHILDREN


Group           N     M no. trials    SD
Experimental    35    79.71           72.15
Control         31    47.61           40.79


EXPERIMENT 2
Method 


Subjects. Fifty-four Caucasian minimally brain- 
damaged children, enrolled in eight “Level I Per- 
ceptually Handicapped Classes,” and 54 Caucasian 
non-brain-damaged children randomly selected from 
first- through fourth-grade regular classes in the 
Memphis city school system were used in this 
experiment. All Ss were randomly assigned to either 
experimental or control groups. 

Children placed in the perceptually handicapped 
classes are identified on criteria patterned directly 
after those proposed by Strauss and Lehtinen (1947). 
The curriculum of the classes is also in accordance 
with these authors’ recommendations. The definition 
of the population of brain-injured children, as in- 
corporated by the Special Education Division of 
the Memphis city school system, can be obtained 
from a planning committee report (Boone, Mackey, 
Jordan, Boone, & Perry, 1965). Information from 
the psychological testing and physical examination 
and available developmental history are used by 
the Special Education Division to determine if the 
evidence is in agreement with Strauss’ description of 
the brain-injured child. It is also recommended that 
the parents arrange for neurological and EEG ex- 
aminations for their child. The children are re- 
evaluated periodically to ascertain if there has been 
a reduction of those behaviors which initiated the 
original referral and a sufficient advancement of 
their achievement level to permit a return to the 
normal class setting. Medication decisions rest with 
the family and its physician; E had no control 
over these decisions nor access to such privileged 
information. Few Ss were known to be receiving 
medication. The only control over possible inter- 
action between drug effects and task performance 
was by random assignment of Ss, and heterogeneous 
evidence of organicity further necessitated random 
assignment to groups rather than matching groups 
on demonstrable evidence of cerebral deficit. It 
should be emphasized that the hypothesis under 
consideration is that minimally brain-damaged children, 
as a group, are more distractible than normals. 

Analysis of variance indicated that the brain- 
damaged and non-brain-damaged experimental and 
control groups did not differ significantly on mean 
CA or MA. The F value for the interaction term in 
the analysis of variance on IQ among these groups 
was significant (F=4.47, df=52, p< .05). Sub- 
sequent t-test comparisons between groups on mean 
IQ indicated that the brain-damaged control group 
was significantly lower than the brain-damaged 
experimental group (t= 2.07, df = 52, p < .05, two- 
tailed), lower than the non-brain-damaged experi- 
mental group (t=3.04, df=52, p<.01, two- 
tailed), and lower than the non-brain-damaged 
control group (t = 3.62, df = 52, p < .01, two-tailed). 
None of the other possible comparisons 
between groups reached significance. 

Procedure. The apparatus and experimental pro- 
cedure were identical to those described in Ex- 
periment 1. 


Results 


The means and standard deviations for 
trials to criterion for the experimental and 
control groups are shown in Table 3. Bart- 
lett’s test for homogeneity of variance (Ed- 
wards, 1960) yielded a corrected B score of 
15.31, which is significant (p < .01) when 
entered into the table of chi-square at df = 3. 
In consideration of Lindquist’s (1956) discus- 
sion of the Norton study, in which it was 
demonstrated that even marked heterogeneity 
of variance has a small effect on the F dis- 
tribution, it was decided to proceed with an 
analysis of variance of these data. The re- 
sultant F value for the interaction term was 
significant (F = 6.77, df = 1/104, p < .05). 
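
Bartlett's test can be recomputed from the four standard deviations listed in Table 3. The following is a minimal sketch, assuming the usual corrected form of Bartlett's statistic and a tabled chi-square critical value of 11.345 for df = 3 at the .01 level; the function name is hypothetical, and the rounded table values will not reproduce the reported 15.31 exactly.

```python
from math import log


def bartlett_corrected(sds, ns):
    """Return Bartlett's corrected chi-square statistic and its df."""
    k = len(sds)
    dfs = [n - 1 for n in ns]
    total_df = sum(dfs)
    variances = [sd ** 2 for sd in sds]
    # Pooled variance: group variances weighted by their degrees of freedom.
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    # Uncorrected statistic: log of pooled variance against the group logs.
    m = total_df * log(pooled) - sum(df * log(v) for df, v in zip(dfs, variances))
    # Bartlett's correction factor.
    c = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return m / c, k - 1


# SDs from Table 3: brain-damaged experimental and control,
# then non-brain-damaged experimental and control (N = 27 each).
sds = [41.16, 52.99, 37.41, 23.49]
ns = [27, 27, 27, 27]
b, df = bartlett_corrected(sds, ns)

CHI2_CRIT_01_DF3 = 11.345  # assumed tabled value, p = .01, df = 3
print(f"B = {b:.2f}, df = {df}, exceeds .01 critical value: {b > CHI2_CRIT_01_DF3}")
```

The recomputed value falls between 15 and 16, close to the reported statistic and well beyond the .01 critical value, which is consistent with the conclusion of heterogeneous variances.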

These results warranted making compari- 
sons between groups, with particular em- 


TABLE 2 


MEAN CA AND MA IN MONTHS AND IQ IN BRAIN-DAMAGED AND NON-BRAIN-DAMAGED 
EXPERIMENTAL AND CONTROL GROUPS IN EXPERIMENT 2 


                          CA               MA               IQ 
Group                  M       SD       M       SD       M       SD 
Brain-Damaged 
  Experimental      108.67   11.81   113.66   31.85   102.48   17.61 
  Control           106.93   18.46    97.55   32.15    92.19   16.99 
Non-Brain-Damaged 
  Experimental      101.67   11.84   110.19   23.25   105.22   13.71 
  Control           101.48   11.94   113.93   24.50   108.00   14.38 


Note. N = 27 in each group. 
TABLE 3 


MEAN TRIALS TO CRITERION FOR BRAIN-DAMAGED AND 
NON-BRAIN-DAMAGED EXPERIMENTAL AND 
CONTROL GROUPS IN EXPERIMENT 2 


Group                 N     M no. trials    SD 
Brain-Damaged 
  Experimental        27    52.56           41.16 
  Control             27    74.41           52.99 
Non-Brain-Damaged 
  Experimental        27    57.26           37.41 
  Control             27    40.64           23.49 


phasis given to the nature of the interaction. 
Since the groups were not homogeneous in 
respect to their variances, the error terms for 
the t-test comparisons were derived from the 
two groups under comparison rather than 
from the error term obtained in the analysis 
of variance. When the variances of the ex- 
perimental and control groups of non-brain- 
damaged children were compared, a signifi- 
cant F ratio resulted (F = 2.99, df = 26/26, 
p < .01). Even when the 52 degrees of freedom 
were halved to df = 26, the t test between 
the means of these two groups was significant 
(t = 2.156, df = 26, p < .05, two-tailed). 
The results demonstrated that the 
experimental condition did interfere with the 
non-brain-damaged groups’ learning of this 
particular task, as it had in Experiment 1. 

There was no significant difference between 
the means of the experimental and control 
groups composed of minimally brain-damaged 
children (t = 1.66, df = 52, p > .05, two-tailed). 
It is interesting to note that although 
the brain-damaged experimental and control 
groups were not significantly different, the 
difference between the groups was in the 
opposite direction from that observed to occur 
in the non-brain-damaged groups. 

The brain-damaged and non-brain-damaged 
experimental groups did not differ in mean 
trials to criterion (t = .414, df = 52, p > .05, 
two-tailed). In comparing the brain-damaged 
and non-brain-damaged control groups, it was 
necessary to reduce by one-half the degrees 
of freedom in the t test, since their variances 
were not homogeneous (F = 5.01, df = 26/26, 
p < .01, two-tailed). Since the brain-damaged 
control group was significantly lower on mean 
IQ than the other groups, an analysis of 
covariance was performed to control statisti- 
cally for the variable of IQ. The test for 
homogeneity of within-cell regression was 
not significant, indicating that the analysis 
of covariance was an appropriate statistical 
test for these data. 

The analysis of covariance indicated that 
when a linear adjustment was made for the 
effect of variation due to differences in IQ 
there were no differences between the brain- 
damaged and non-brain-damaged experi- 
mental and control groups on mean trials to 
criterion (F = 3.79, df = 1/103, p > .05). 


EXPERIMENT 3 
Method 


Subjects. Twenty-two Caucasian Ss from two 
“Level II Perceptually Handicapped” classes were 
selected for the experiment. The same selection 
procedures were used in identifying these children 
as for the “Level I Perceptually Handicapped” 
classes described in Experiment 2. Experimental and 
control groups were formed by random assignment. 
Table 4 contains the means and standard deviations 
for CA and MA, in months, and IQ (PPVT, Form 
A) for this older group of minimally brain-damaged 
children. 

The experimental and control groups were not 
significantly different in mean CA and MA (CA: 
t=1.91, df=20, p> .05, two-tailed; MA: t= 1.23, 
df=20, p> .05, two-tailed). The groups did differ 
in IQ (t = 2.63, df = 20, p < .05, two-tailed). 

Apparatus. The apparatus described in Experi- 
ment 1 was used. The discrimination task differed 
from that in Experiments 1 and 2. The same three 
figures again varied in color, but also in size. Twenty- 
seven stimulus cards were arranged in a series so 
that the triangle, which was the discriminative stimu- 
lus, was never in the same size, color, and position 
combination on any two successive trials. The series 
of cards was repeated until the criterion of 10 con- 
secutively correct responses was attained. 


TABLE 4 


MEAN CA AND MA IN MONTHS AND IQ FOR 
OLDER BRAIN-DAMAGED EXPERIMENTAL AND 
CONTROL GROUPS IN EXPERIMENT 3 


                       CA               MA               IQ 
Group               M       SD       M       SD       M       SD 
Experimental     142.09   19.88   145.27   20.76   102.27   [illegible] 
Control          156.45   12.98   134.64   17.91    91.09   [illegible] 


Note. N = 11 in both groups. 
Procedure. The experimental procedure was the 
same as used in Experiments 1 and 2. 


Results 


The purpose of Experiment 3 was to in- 
vestigate the effect of the experimental condi- 
tion on older brain-damaged children; their 
performance was not compared with a group 
of non-brain-damaged subjects. The t test for 
differences in mean trials to criterion was not 
significant (t = 1.46, df = 20, p > .05, two-tailed). 


Discussion 


The purpose of the study was to test the 
hypothesis that minimally brain-damaged 
children would require more trials to learn 
a three-choice discrimination task in the 
presence of increased peripheral stimulation. 
Experiment 1 established the distracting con- 
dition as one which had a significant detri- 
mental effect on learning the task in non- 
brain-damaged children. The interfering ef- 
fect of the flashing, multicolored lights was 
verified by replication with the non-brain- 
damaged children involved in Experiment 2. 
It was expected that brain-damaged children 
would require more trials to learn the dis- 
crimination task than would normal subjects 
in the absence of irrelevant visual stimuli 
(Lehtinen, 1960). This latter expectancy was 
supported by the significant difference in mean 
trials to criterion between the non-brain- 
damaged and brain-damaged control groups. 
However, the hypothesis that task-irrelevant 
visual stimuli would interfere with learning in 
brain-damaged children was not supported; 
there was no significant difference in mean 
trials to criterion between the experimental 
and control groups of young brain-damaged 
children. 

It was found that when differences in in- 
telligence level were controlled statistically 
through analysis of covariance, there were no 
significant differences in mean trials to cri- 
terion between the experimental and control 
groups composed of brain-damaged and non- 
brain-damaged children. Perhaps intelligence 
is more relevant to distractibility than the 
diagnostic label of minimal brain damage. 

Although it was found that a distracting 
peripheral visual stimulus condition did not 
interfere significantly with the brain-damaged 
subjects’ performance, it would be an un- 
warranted generalization of these data to 
conclude that brain-damaged subjects are less 
responsive to distracting conditions on all 
sense modalities than non-brain-damaged sub- 
jects. Furthermore, the experimental condi- 
tions of the study were unique; the distracting 
stimuli were not background stimuli incor- 
porated within the discrimination learning 
task, but were located on the subject’s visual 
periphery. Perhaps the cogent finding is that 
visual distractibility as a behavioral correlate 
of brain injury in children remains an open 
question. 
PREDICTING INSTITUTIONAL AND POSTRELEASE 
ADJUSTMENT OF DELINQUENT BOYS 


JAMES E. COWDEN AND ASHER R. PACHT 
Wisconsin Department of Public Welfare, Madison 


152 consecutive 1st admissions to a correctional school for delinquent boys 
were assessed in order to develop and compare the predictive efficiency of 5 
multiple-regression equations. Those equations showing significant predictive 
efficiency included those predicting: (a) institutional adjustment, (b) the 1st 
3 mo. of postrelease adjustment, (c) time on parole until revocation, and 
(d) time on parole until discharge. A global prognostic rating based upon 
personality factors best predicted institutional adjustment, while a global 
prognostic rating based primarily upon family background factors best 
predicted postrelease adjustment. Ratings from parole agents’ reports were most 
predictive of time on parole until revocation and of time on parole until 
discharge from parole status. This study also demonstrated a technique for using 
multiple-regression results to segregate delinquent boys into subgroups with 
differing supervision and treatment needs, facilitating the use of such results in 
making day-to-day decisions about them. 

This largely exploratory study had two 
purposes: first, to select from among a large 
number of variables those which best pre- 
dicted the institutional and postrelease ad- 
justment of delinquent boys; second, to de- 
vise a means whereby the results of multiple- 
regression analyses could be translated into a 
form directly usable by institutional staff to 
classify newly committed delinquent boys as 
to their probable institutional and postre- 
lease adjustment. Selection of variables as 
potential predictors was based in part upon 
the results of earlier research. Hathaway and 
Monachesi (1953) and Peterson, Quay, and 
Cameron (1959) focused upon personality 
variables as predictors. Reckless (1955) 
studied age as a predictor variable, and 
Weeks (1943) and Glueck (1950) assessed 
the relationship of the boy’s home environ- 
ment to his postrelease adjustment. Glueck 
(1945) also studied both the seriousness of 
the delinquents’ offenses as a predictor vari- 
able and the relationship of institutional ad- 
justment to postrelease adjustment. Addi- 
tional variables were included as a result of 
earlier work in this area by Cowden (1966). 


METHOD 


The sample used in this study was comprised of 
152 consecutive first admissions to the Wisconsin 
School for Boys in Wales, Wisconsin. These Ss 
were randomly divided into construction and 
cross-validation samples of 76 boys each. The predictive 
efficiency of the following predictor variables was 
assessed: age in months at the time of first admission; 
IQ as obtained from the Otis or Lorge-Thorndike 
intelligence tests; ratings by experienced psychologists 
or psychiatrists of hostility, anxiety, guilt, 
dependency, personality disturbance, maturity, social 
delinquent features, and home environment; ratings 
by the Catholic and Protestant chaplains of religious 
involvement; scale scores from a modified 
version of the Buss-Durkee Hostility-Guilt Inventory 
(HGI; 1957) measuring hostility, anxiety, guilt, 
parental identification, dependency, and religious 
involvement; scale scores on Berdie and Layton’s 
(1957) Minnesota Counseling Inventory (MCI) 
measuring defensiveness, family relationships, social 
relationships, emotional stability, control, reality 
testing, mood, and leadership; behavioral ratings by 
cottage counselors on the variables of hostility, 
anxiety, guilt, dependency, delinquent identifications, 
relationships with peers, relationships with adults, 
and social skills. 

Four additional predictor variables and three of 
the five criterion variables were rated independently by 
two psychologists from case-file material to assess 
the reliability of these ratings. The predictor variables 
included ratings of the seriousness of the offense 
leading to the boy’s commitment on a 20-point scale 
ranging from truancy to murder, and three prognostic 
ratings based upon reports by psychologists or 
psychiatrists, social workers, and cottage counselors. 
Three of the criterion variables used were ratings 
of institutional adjustment derived from social 
workers’ prerelease reports, and two ratings of 
postrelease adjustment based upon parole agents’ 
chronological reports detailing the boys’ adjustment 
for the first 3 months and the first 2 years after 
release from the institution. These three ratings were 
used as criterion variables in Regression Equations 
1-3 (Tables 1-3) and subsequently as predictor 
variables in Regression Equations 4-5 (Tables 4-5). 
Two additional criterion variables were used: time 
on parole until revocation (40 boys in the construction 
sample whose parole was revoked at some 
point during the 2-year follow-up period were included), 
and time on parole until discharge (44 boys 
from the construction sample who were discharged 
from parole during the 2-year follow-up period 
were included). 


1 We would like to thank the superintendent and 
staff of the Wisconsin School for Boys, Wales, for 
their cooperation and able assistance in gathering 
the data for this study. 

Multiple-regression equations were obtained from 
the construction sample to determine which variables 
were most predictive of each of the five criterion 
variables in turn. The β coefficients produced were 
then applied to the cross-validation sample by mul- 
tiplying them by each S’s scores on the predictor 
variables for each multiple-regression equation. The 
sum of the products thus obtained from each mul- 
tiple-regression equation was the “expected” cri- 
terion score for each S in the cross-validation sample. 
Frequency distributions of those expected criterion 
scores for each criterion variable were subdivided 
into thirds representing low, medium, and high sub- 
groups, and cutoff scores were obtained. The per- 
centage accuracy of classification of Ss into each of 
the three subgroups was then measured by com- 
paring the degree of agreement in classification using 
expected versus actual criterion scores. Mean actual, 
as opposed to expected, criterion scores of these 
three subgroups were calculated, and the significance 
of the differences between the means of the low, 
medium, and high subgroups were compared for 
each of the five criterion variables in turn. This pro- 
cedure not only provided an assessment of the pre- 
dictive efficacy of each of the five multiple-regression 
equations, but also had the added advantage of 
being immediately translatable, with a minimum of 
calculations, into a usable form by institutional staff 
personnel. 
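
The cross-validation procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' computation: the weights, predictor scores, and actual criterion scores are invented; the cutoffs are taken as the tertile boundaries of the expected scores; and "agreement" is assumed to mean that a boy's actual score falls in the same band as his expected score when the same cutoffs are applied. No intercept is used, following the description of summing the products.

```python
from statistics import mean


def expected_scores(weights, rows):
    """Sum of weight-by-score products for each subject."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def tertile_cutoffs(scores):
    """Two cutoff scores splitting the distribution into thirds."""
    ordered = sorted(scores)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def classify(score, cutoffs):
    """Assign a score to the low, medium, or high band."""
    low_cut, high_cut = cutoffs
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


def evaluate(weights, rows, actual):
    """Per-band agreement percentage and mean actual criterion score."""
    expected = expected_scores(weights, rows)
    cutoffs = tertile_cutoffs(expected)
    summary = {}
    for label in ("low", "medium", "high"):
        members = [i for i, e in enumerate(expected) if classify(e, cutoffs) == label]
        if not members:
            continue  # a band can be empty in very small samples
        hits = sum(classify(actual[i], cutoffs) == label for i in members)
        summary[label] = {
            "n": len(members),
            "pct_agreement": 100 * hits / len(members),
            "mean_actual": mean(actual[i] for i in members),
        }
    return cutoffs, summary


# Hypothetical data: two predictor weights and six cross-validation subjects.
weights = [0.5, 0.3]
rows = [(2, 1), (3, 2), (4, 4), (6, 5), (8, 6), (9, 9)]
actual = [1, 4, 3.5, 5, 6, 7]

cutoffs, summary = evaluate(weights, rows, actual)
for label, stats in summary.items():
    print(label, stats)
```

Once the two cutoff scores are fixed, staff need only compute a new boy's weighted sum and compare it with the cutoffs, which is the low-calculation use the procedure is meant to support.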


RESULTS 


The reliabilities of the variables independ- 
ently rated by two psychologists were found 
to be moderately high, with reliability co- 
efficients ranging from .70 to .84. 

The seven variables found to be most 
highly predictive of the boys’ rated institu- 
tional adjustment are shown in Table 1, 
ranked in order of their relative predictive 
weights (β coefficients). These results indicate 
that a positive global prognosis derived 
from reports by clinicians (psychologists or 
psychiatrists) based mainly on personality 
variables was the single best predictor of a 
positive institutional adjustment. The mul- 
tiple-correlation coefficients using these seven 
predictor variables were .74 and .65, respectively, 
for the construction and cross-validation 
samples. Next, the “expected” 
institutional adjustment scores of the cross- 
validation group were used to categorize 
them into low, medium, and high subgroups. 
The cutoff scores for the three subgroups 
(on a seven-point scale) were 2.6 and 5.3, 
and the percentage accuracy of classification 
was 52%, 37%, and 58% for the low, me- 
dium, and high subgroups, respectively. The 
mean actual institutional adjustment scores 
for each of these subgroups were 3.12, 3.63, 
and 4.58, respectively. There was thus a 
progressive increase in mean institutional 
adjustment score over the three subgroups. 
The means of the medium and high sub- 
groups differed significantly from one another 
beyond the .05 level. This multiple-regression 
equation thus produced a clear differentiation 
between two of the three subgroups of boys 
in the cross-validation sample. 

The eight variables found to be most 
highly predictive of the boys’ rated first 3 
months of postrelease adjustment are shown 
in Table 2, ranked in order of their relative 
predictive weights. These results indicate 
that a positive global prognosis derived from 
social workers’ reports (based mainly on 
family background factors) was the single 
best predictor of a positive short-term post- 
release adjustment. The multiple-correlation 
coefficients (using these eight predictor vari- 
ables) were .45 and .37, respectively, for 
the construction and cross-validation samples. 
The cutoff scores for the three subgroups (on 
a seven-point scale) were 3.3 and 5.3, and 


TABLE 1 


INSTITUTIONAL ADJUSTMENT AS RATED FROM SOCIAL 
WORKERS’ FINAL PRERELEASE REPORTS: 
REGRESSION EQUATION ONE 


Predictor variable                               β coefficient 
Clinical service global prognosis (clinician)        .500 
Relationship with adults (counselor)                 .311 
Social skills (counselor)                           -.267 
Religious involvement (chaplain)                     .210 
Guilt (clinician)                                    .189 
Counselor prognosis (counselor)                     -.129 
Anxiety (counselor)                                 -.026 
the percentage accuracy of classification was 
44%, 50%, and 43% for the low, medium, 
and high subgroups, respectively. The mean 
scores of the low, medium, and high sub- 
groups in actual postrelease adjustment 
(categorized upon the basis of their expected 
scores) were 2.56, 2.68, and 3.40, respectively. 
The difference between the medium and high 
subgroups was significant beyond the .05 
level. This regression equation thus differenti- 
ated clearly between two of the three sub- 
groups in the cross-validation sample. 

The eight variables found to be most pre- 
dictive of the boys’ first 2 years of post- 
release adjustment are shown in Table 3, 
ranked in order of their relative predictive 
weights. These results indicate that a positive 
global prognosis derived from social workers’ 
reports was the best single predictor of a 
positive long-term postrelease adjustment. 
The multiple-correlation coefficients (using 
these eight predictor variables) were .55 and 
.41, respectively, for the construction and 
cross-validation samples. 

The cutoff scores for the three subgroups 
(on a seven-point scale) were 2.4 and 5.0, 
and the percentage accuracy of classification 
was 23%, 44%, and 56% for the low, me- 
dium, and high subgroups, respectively. The 
mean postrelease adjustment scores of the 
low, medium, and high subgroups (cate- 
gorized upon the basis of their expected 
scores) were 3.15, 2.15, and 3.76, respectively. 
In this case, the multiple-regression equation 
was not very effective in separating the three 


TABLE 2 


POSTRELEASE ADJUSTMENT AS RATED FROM PAROLE 
AGENTS’ FIRST 3 MONTHS CHRONOLOGICAL 
REPORTS: REGRESSION EQUATION TWO 


Predictor variable                               β coefficient 
Social worker global prognosis (social worker)       .356 
Anxiety (HGI)                                       -.176 
Anxiety (counselor)                                  .122 
Anxiety (clinician)                                 -.104 
Mood (MCI)                                           .087 
Personality disturbance (clinician)                 -.079 
Relationship with adults (counselor)                [illegible] 
Hostility (HGI)                                     [illegible] 
TABLE 3 


POSTRELEASE ADJUSTMENT AS RATED FROM PAROLE 
AGENTS’ REPORTS UNTIL DISCHARGED FROM 
PAROLE OR FOR A MAXIMUM OF 2 YEARS: 
REGRESSION EQUATION THREE 


Predictor variable                               β coefficient 
Social worker global prognosis (social worker)       .560 
Dependency (clinician)                              -.356 
Relationship with peers (counselor)                 -.312 
Dependency (HGI)                                    -.284 
Guilt (HGI)                                          .216 
Anxiety (HGI)                                       -.200 
Religious involvement (chaplain)                    -.154 
Mood (MCI)                                           .148 


subgroups into categories showing a progres- 
sive increase in mean postrelease adjustment 
scores. The difference between the mean of 
the medium subgroup (which had the lowest 
mean score of the three) and that of the 
high subgroup was significant, however, be- 
yond the .01 level. 

The four variables found to be most highly 
predictive of time on parole until revocation 
are shown in Table 4, ranked in order of their 
relative predictive weights. These results in- 
dicate that a positive postrelease adjustment 
rating based upon the first 3 months of the 
parole agents’ chronological reports was the 
best single predictor of length of time on 
parole until revocation. The multiple-correla- 
tion coefficients (using these four predictor 
variables) were .51 and .55, respectively, for 
the construction and cross-validation samples. 
The cutoff points for the three subgroups, on 
a scale measuring days between release from 
the institution and revocation, were 87 days 
and 228 days, and the percentage accuracy 
of classification was 53%, 43%, and 47% 
for the low, medium, and high subgroups, 
respectively. The mean lengths of time (in 
days) spent on parole until revocation of the 
low, medium, and high subgroups (cate- 
gorized upon the basis of their expected 
scores) were 98.7, 225.9, and 272.8, respec- 
tively. The difference between the means of 
the low and medium subgroups was significant 
beyond the .01 level. This multiple-regression 
equation thus effectively differentiated be- 
TABLE 4 


TIME ON PAROLE IN DAYS UNTIL REVOCATION: 
REGRESSION EQUATION FOUR 


Predictor variable                                        β coefficient 
Parole agent 3-month rating of postrelease adjustment         50.5 
Guilt (clinician)                                            -18.4 
Anxiety (counselor)                                          -17.2 
Seriousness of prior offenses                                -14.1 

tween two out of the three subgroups of 
subjects in the cross-validation sample. 

The four variables found to be most highly 
predictive of time on parole until discharge 
from parole are shown in Table 5, ranked in 
order of their relative predictive weights. 
These results indicate that a positive post- 
release adjustment rating based upon the 
first 3 months of the parole agents’ chronolog- 
ical reports was correlated negatively with 
length of time on parole until discharge. The 
multiple-correlation coefficients (using these 
four predictor variables) were .50 and .55, 
respectively, for the construction and cross- 
validation samples. The cutoff points for the 
three subgroups, on a scale measuring days 
between release from the institution and discharge 
from parole, were 473 days and 726 
days, and the percentage accuracy of classifi- 
cation was 57%, 60%, and 64% for the low, 
medium, and high subgroups, respectively. 
The mean length of time on parole (in days) 
until discharge from parole of the low, me- 
dium, and high subgroups (categorized upon 
the basis of their expected scores) was 495.0, 
545.0, and 706.4, respectively. The difference 


TABLE 5 


TIME ON PAROLE IN DAYS UNTIL DISCHARGE FROM 
PAROLE: REGRESSION EQUATION FIVE 


Predictor variable                                        β coefficient 
Parole agent 3-month rating of postrelease adjustment        -62.83 
Anxiety (clinician)                                          -16.50 
Mood (MCI)                                                    13.86 
Parole agent 2-year report of postrelease adjustment          10.22 
between the means of the medium and high 
groups was significant beyond the .05 level. 
This multiple-regression equation was thus 
effective in differentiating clearly between two 
of the three subgroups in the cross-valida- 
tion sample. 


DISCUSSION 


The results indicated that a global prog- 
nosis based principally upon personality 
factors best predicted institutional adjustment, 
while a global prognosis based primarily upon 
family background factors best predicted postrelease 
adjustment. These results clearly demonstrate 
the value of the clinician in integrating 
various data for making a predictive judgment 
which is more accurate than any obtained 
directly from the individual measures. 
The following personality and behavioral 
rating factors were also predictive of institu- 
tional and postrelease adjustment: The higher 
a boy’s anxiety rating, the more negative was 
his institutional and postrelease adjustment; 
the higher his dependency rating, the more 
negative was his postrelease adjustment only. 
The higher a boy’s score on the MCI scale 
measuring depressive features, the more positive 
was his postrelease adjustment; the higher 
his guilt rating, the more positive was his 
institutional and postrelease adjustment. The 
ability to relate positively to adults, as rated 
by cottage counselors, proved to be an im- 
portant behavior-rating variable in predicting 
both a positive institutional and postrelease 
adjustment. The more religious involvement 
shown by a boy, on the other hand, the more 
positive his institutional adjustment and the 
more negative his postrelease adjustment. 
Perhaps a higher degree of religious involve- 
ment functions as a curb upon delinquent be- 
havior only if reinforcing agents are close at 
hand, or if supplemented by institutional 
controls. 

The more positive the parole agents’ re- 
ports of the subjects’ adjustment up to 3 
months after release from the institution, the 
longer the time until revocation, and the 
shorter the time until discharge from parole. 
Hence, these chronological reports functioned 
quite effectively as a means of making accurate 
predictions about length of time until 
revocation or release. A criticism might be 
made of using ratings derived from parole 
agents’ chronological reports as predictors of 
revocation and discharge from parole since, 
of course, the agent plays a part in such 
decisions. However, most revocation decisions 
in particular are made on the basis of a new 
offense by the parolee. Unless the new offense 
is quite minor, the parole agent in almost all 
cases has no recourse except to revoke. The 
agent has somewhat greater flexibility in 
deciding whether or not to revoke in the 
case of rule violations, but even here the cri- 
teria for revocation are quite specific, par- 
ticularly in the case of more serious rule 
violations. The parole agent can recommend 
early release for those who have a good 
adjustment, but here again the agent does 
not make the actual decision. Hence, the 
ratings derived from the parole agent’s 
chronological reports appear to function, 
though with some reservations, as predictors 
which are reasonably independent of the cri- 
terion variables. In addition, for practical 
purposes of predicting potential revocation or 
discharge decisions (sometimes occurring 
many months in the future), these results 
strongly suggest that parole agents’ chron- 
ological reports merit close study. The results 
also indicated, interestingly enough, that the 
more guilt and anxiety displayed by in- 
dividuals within the institution, the shorter 
their length of time on parole until revoca- 
tion. One explanation is that delinquent 
activities serve as a defense against guilt and 
anxiety and, as a result, may actually mo- 
tivate acting-out behavior in the community, 
where there are fewer external controls. 

The second major purpose of this study, 
that is, providing a means of translating 
multiple-regression results, with a minimum 
of calculations, into a form directly usable 
by institutional staff members, was accomplished 
for four of the five regression equations. 
These regression equations functioned 
satisfactorily in segregating boys into low, 
medium, and high subgroups, showing progressively 
higher mean criterion scores. The 
results suggest that these equations can now 
be used as screening devices to segregate boys 
into subgroups differing in supervision or 
treatment needs. 

In general, the equation predicting long- 
term postrelease adjustment did not function 
as well as the others. A subsequent review of 
selected postrelease reports on subjects used 
in this study suggested that difficulties in 
predicting long-term criteria of this type are 
heightened because uncontrolled and un- 
anticipated situational factors significantly 
altered postrelease adjustment. One of the 
critical factors appears to be the kind of ex- 
ternal controls available to the boy. A study 
should be designed to evaluate the signifi- 
cance of these situational factors on post- 
release adjustment. 
TEST INTERFERENCE IN A RORSCHACH-WAIS 
ADMINISTRATION SEQUENCE 


J. THOMAS GRISSO AND ARNOLD MEADOW 1 


University of Arizona 


The study was designed to test the hypothesis that scores on certain WAIS 
subtests would be lowered when presented immediately after the Rorschach. 
College students, who had taken a short form of the WAIS, were divided into 3 
groups matched on the variables IQ, sex, and scores on 3 WAIS verbal subtests. 
Each of the 3 groups was given 1 of 3 treatments immediately preceding 
readministration of the subtests: the associative phase of the Rorschach, a 
modified administration of the Bender-Gestalt Test, or no preceding test. The 
hypothesis was confirmed—subtest difference scores between pre- and posttest 
administration of the WAIS subtests for the group receiving the Rorschach 
were significantly below those of the 2 control groups. 


The problem of the effect of sequence of 
different types of tests on test scores has 
received scant attention in the research 
literature. Acknowledging the need for such 
research, L’Abate (1964) has noted that our 
knowledge of serial effects in test batteries is 
practically nil. 

L’Abate (1964) suggests a three-stage 
sequence for test batteries: (1) simple, in- 
nocuous tests (Draw-A-Person Quality Scale, 
DAP; Bender-Gestalt); (2) more complex 
and informative tests (WAIS, MMPI); (3) 
tests liable to create more anxiety (Rorschach, 
TAT). Thus, he reasons that tests in Stage 
2 will not be affected by anxiety aroused by 
tests in Stage 3. Rapaport, Gill, and Schafer 
(1945) mention that the Rorschach is usually 
not given as the first test in a battery, but 
they offer no specific rationale for this prac- 
tice. Brown (1958) takes an opposite posi- 
tion. He advocates a sequence which pro- 
gresses with relation to the degree of intensity 
of the interaction between the patient and the 
examiner as a result of the nature of the 
tests. He places the Wechsler-Bellevue (W-B) 
after the Rorschach and TAT in this se- 
quence, and he analyzes incorrect responses 
in the W-B with relation to projective data 
obtained in the previous tests. He does not 
discuss the possibility of test interference. 
Piotrowski (1958) also administers the 
Rorschach before the W-B (DAP or House- 


1 The authors are indebted to Nelson F. Jones of 
the University of Arizona for his helpful suggestions 
and criticisms.


Tree-Person Projective Technique, HTP; 
Rorschach; then W-B Similarities and Com- 
prehension). He believes that to present the 
W-B subtests (which he describes as formal 
and impersonal) before the Rorschach would 
exert an inhibitory influence upon the pa-
tient’s imagination, thus yielding a “less
meaningful” Rorschach record. 

In addition to clinical opinions, a few 
empirical research studies have been reported.
Gibby, Stotsky, and Miller (1954) found 
that among 100 neurotics, the scores of a 
group given the Rorschach alone did not 
differ significantly, with respect to 11 
Rorschach scoring variables and a number of 
content categories, from the scores of other 
subjects receiving the Rorschach following 
any one of four other tests: Bender-Gestalt, 
TAT, W-B, or Goldstein-Scheerer Tests of 
Abstract and Concrete Thinking. Analyses 
of the variances of the 11 scoring variables 
under these five conditions yielded F ratios 
very close to 1, none even approaching sig-
nificance. If one may use these 11 formal
scoring categories as criteria for that which 
makes up a meaningful Rorschach, then it 
would appear that Piotrowski’s (1958) specu-
lation concerning the inhibitory effects of the
formal and relatively structured Wechsler- 
Bellevue upon the Rorschach is unwarranted. 

Test interference has been noted in at least
two other studies, however. Berkun and
Burdick (1964) observed that among a group
of normal subjects given the Rosenzweig
Picture-Frustration Study (PF) after the
TAT, the mean score of extrapunitive
hostility on the PF was significantly greater 
(and the score on intrapunitive hostility sig- 
nificantly lower) than that for another group 
who received the PF prior to the TAT. The 
authors offer no explanation for their findings. 
Cassell, Johnson, and Burns (1962) gave 
the HTP, Wechsler-Bellevue II (short form), 
and the reading, spelling, and arithmetic 
portions of an achievement test to six groups 
of normal subjects. Each group received a 
different sequence of the three tests. Se- 
quence had no significant effect on mean 
scores for any of the tests. Of the six W-B II 
group means, however, two of the three lowest 
mean scores occurred in the only two se- 
quences wherein the W-B II followed the 
HTP.

To summarize, the literature presents con- 
tradictory suggestions. One author favors 
administering the WAIS prior to the Rors- 
chach, while two others take an opposite 
position. One experimental study shows no 
interference on the Rorschach by any one of 
four preceding tests, but two other studies 
suggest that relatively unstructured tests 
may interfere with performance on subse- 
quent tests. 

The present study was designed to provide 
additional information with respect to the 
problem of test sequence by studying the pos- 
sible effects of the administration of the 
Rorschach upon a subsequently administered 
WAIS. 

The theoretical rationale for hypothesizing 
that test interference may occur is based on 
the nature of the tests themselves and the 
thought processes involved in the production 
of responses on the tests. One formulation 
would predict that the Rorschach, as an un-
structured test stimulus, might produce more
regressive, autistic, and impulsive thinking 
(Holt & Havel, 1960) which may carry over 
to subsequent performance on the WAIS. 
The WAIS scores might then be impaired, 
since the WAIS requires more selective, 
adaptive, reality-oriented responses for opti-
mal performance (Schafer, 1954). A second
formulation would postulate that the inter- 
ference effect might be produced by anxiety 
aroused by the Rorschach experience. The 
relative contributions of these factors will be 
evaluated subsequent to the presentation of
the results of the present study. 

It is hypothesized that if the Rorschach, an 
unstructured and projective test, immediately 
precedes administration of the WAIS, a very 
structured test, then test interference will 
occur which will lower scores significantly 
on the structured test. 


METHOD 
Subjects 


The Ss were 60 college students (36 females, 24 
males) enrolled in introductory psychology classes 
at the University of Arizona. All Ss were volun- 
teers. The students were told that volunteers must 
attend two testing sessions, that the first session 
would include a short intelligence test, that they 
would receive their estimated IQ scores at the 
end of the second testing session, and that all test 
scores would be strictly confidential. Prospective Ss 
were asked not to volunteer if they had taken an 
intelligence test within the last 3 years. 

Class standings of the Ss were as follows: 29
freshmen, 21 sophomores, 7 juniors, 2 seniors, and 
1 unclassified. Ages ranged from 18 to 49 years 
(M = 20.63 years), and IQ scores ranged from 98 
to 140 (M = 115.6, SD = 8.64). 


Preliminary Session 


All Ss were seen individually by one E over a 
period of 2 weeks. The Digit Span, Comprehension, 
Similarities, Picture Completion, and Block Design 
subtests of the WAIS were administered first. Be-
fore leaving, S was informed that he would be
contacted in his psychology class at a later date to 
schedule a time for his second session. 

The three verbal subtests (presented again fol- 
lowing the Rorschach or control conditions) were 
chosen on the basis of their assumed susceptibility 
to interference via the proposed phenomena stated 
in the rationale for the hypothesis of this study. 
Specifically, it was believed that anxiety aroused 
by the Rorschach experience would impair con- 
centration required for good performance on the 
Digit Span subtest. The Comprehension subtest is 
probably the least structured of the WAIS verbal 
subtests and was chosen for its sensitivity to im- 
pulsive and autistic responses which might occur 
following the Rorschach. The choice of the Similari- 
ties subtest was based upon the assumption that 
Rorschach-induced primitive thinking, being con- 
crete in nature, might inhibit an abstract-conceptual 
level of thinking required for good performance 
on this subtest. Picture Completion and Block 
Design scores were used only to contribute to IQ 
score, and were not further analyzed nor used in 
obtaining the main results of this study. 

After all 60 Ss had been seen, E scored the pro- 
tocols and recorded each S’s raw scores on the five 
subtests. Estimates of IQ were calculated by con-
verting raw scores to scaled scores, dividing the
sum of an S’s five scaled scores by five, multiply- 
ing this quotient by the number of subtests in the 
WAIS, and referring to the WAIS manual to find 
the corresponding IQ score. IQ scores calculated 
with the pentad used here have been shown to yield 
a correlation of .964 with Full Scale WAIS score 
(Maxwell, 1957). 
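
The prorating arithmetic described above can be sketched as follows. This is only an illustration under stated assumptions: the function name and the sample scaled scores are hypothetical, the count of 11 subtests in the full WAIS is supplied here rather than taken from the text, and the final conversion to IQ by way of the WAIS manual is not reproduced.

```python
# Sketch of the short-form prorating step (names and sample values are hypothetical).
WAIS_SUBTEST_COUNT = 11  # assumption: the full WAIS comprises 11 subtests


def prorated_scaled_sum(scaled_scores, full_count=WAIS_SUBTEST_COUNT):
    """Estimate the full-scale sum of scaled scores from a short form.

    The mean scaled score on the administered subtests is multiplied by the
    number of subtests in the full scale. The result would then be looked up
    in the test manual's IQ table, which is not reproduced here.
    """
    if not scaled_scores:
        raise ValueError("at least one scaled score is required")
    mean_scaled = sum(scaled_scores) / len(scaled_scores)
    return mean_scaled * full_count


# Hypothetical scaled scores for Digit Span, Comprehension, Similarities,
# Picture Completion, and Block Design.
example = [12, 13, 11, 12, 12]
estimate = prorated_scaled_sum(example)  # mean of 12.0 times 11 subtests
```

The multiplication simply rescales the five-subtest mean to the length of the full scale, so that the manual's table for the complete test can be entered directly.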


Treatment Groups 


The Ss were then matched as closely as possible 
with relation to five variables, so that 20 matched 
triplets were formed. Each S in a triplet was 
randomly assigned to one of three treatment groups, 
to form three relatively well-matched groups of 
20 Ss each. 

The five matching variables used were sex, IQ, and 
raw scores on the Digit Span, Comprehension, and 
Similarities subtests. Perfect matching was not 
possible (except for sex) due to the sample size, 
extent of variability of scores, and the number of 
variables used. Nevertheless, except for Ss scoring 
at the extremes of the distribution of scores, any 
S was within six points on IQ, two points on Digit 
Span, four points on Comprehension, and four 
points on Similarities from either of the other two 
Ss in the triplet. 
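
The matching tolerances and the random assignment within triplets might be checked along the following lines. This is a hedged sketch, not the procedure actually used: the dictionary keys, function names, and group labels as strings are assumptions made for illustration, and the exception allowed for Ss at the extremes of the distribution is not modeled.

```python
import random

# Maximum spread allowed within a triplet, as reported in the text.
TOLERANCE = {"iq": 6, "digit_span": 2, "comprehension": 4, "similarities": 4}


def triplet_is_matched(triplet, tolerance=TOLERANCE):
    """Return True if all three Ss share a sex and, on every matching
    variable, the largest pairwise difference is within the tolerance."""
    if len({s["sex"] for s in triplet}) != 1:
        return False
    for variable, limit in tolerance.items():
        values = [s[variable] for s in triplet]
        # max - min bounds every pairwise difference in the triplet
        if max(values) - min(values) > limit:
            return False
    return True


def assign_triplet(triplet, rng, groups=("X", "C1", "C2")):
    """Randomly assign the three members of a triplet to the three groups."""
    order = list(groups)
    rng.shuffle(order)
    return dict(zip(order, triplet))
```

Passing an explicit `random.Random` instance keeps the assignment reproducible when a seed is recorded.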

Mean IQs of the three groups formed by match- 
ing procedures were almost identical: 115.75, 115.65, 
and 115.40. To test for accuracy of matching, 
analyses were made of the variances of each of the 
three WAIS subtests for the three groups formed 
by matching procedures. All three Fs were non-
significant; thus the matching was judged to be
satisfactory. 


Treatment Session 


The Ss were once again seen individually, in the 
same room, and by the same E, as during the 
preliminary session. Times between the preliminary 
and treatment sessions ranged from 20 to 35 days. 
Each S was seen whenever his schedule would 
allow, regardless of the group to which he had 
been assigned. Each S was once again administered 
the Digit Span, Comprehension, and Similarities 
subtests of the WAIS; at the end of the session 
he was given his estimated IQ score, as well as a 
short discourse on its meaning and interpretation. 


Differential treatment of the three groups was 
as follows: 


Group X. This group was first given the as- 
sociative phase of the Rorschach using all 10 cards. 
The Ss received instructions suggested by Klopfer 
and Davidson (1962, p. 28), and they were not 
questioned concerning their responses. Times for the 
associative phase ranged from 11 to 18 minutes. 
Immediately after the last response to Card 10, 
the Ss received the three WAIS subtests. 

Group C1. This group first received the Bender-
Gestalt, then the three WAIS subtests. The Bender-
Gestalt was chosen as a control test because it
was judged to be relatively innocuous (Bender,
1946). In addition, when S had completed the draw- 
ings, he could be asked to draw them again when 
they were presented upside down or at some other 
angle. By continuing the task in this way until S
had worked well into the range of the time re-
quired for the associative phase of the Rorschach,
the treatment for Group C1 offered a control for the
time variable and its possible effects upon perform- 
ance on the WAIS subtests. 

Group C2. This group received the three WAIS
subtests only and represented, essentially, a control 
for recall and practice effects. 

The WAIS protocols were scored by E without
knowledge of the group to which any protocol 
belonged, and raw scores were recorded. 


RESULTS 


Difference (D) scores were computed for 
each subject on each subtest by subtracting 
his original subtest score (raw score) from 
his score on the same subtest subsequent to 
the experimental or control treatment. The 
effect of the Rorschach upon performance on 
any WAIS subtest was indicated by the dis- 
crepancy between the mean D score for 
Group X and those for Groups C1 and C2.
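
The difference-score computation can be illustrated with a short sketch. The function names and the raw scores below are hypothetical and serve only to show the direction of subtraction (retest score minus original score), so that a negative D indicates a decline.

```python
def difference_scores(pre, post):
    """Return D = post - pre for each subject, paired by position."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    return [after - before for before, after in zip(pre, post)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical raw scores for four Ss on one subtest.
pre_scores = [11, 12, 10, 13]
post_scores = [10, 12, 9, 11]
d_scores = difference_scores(pre_scores, post_scores)  # [-1, 0, -1, -2]
mean_d = mean(d_scores)  # -1.0
```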

Three one-way analyses of variance of D
scores were computed to assess the effects
among groups on each subtest individually.
The F ratios from this computation, as well
as the mean D scores of each group on the
three subtests, appear in Table 1. Each test
yielded Fs which were highly significant
(Digit Span: p[F = 8.35] < .001; Compre-
hension: p[F = 11.72] < .001; Similarities:
p[F = 5.85] < .01). The results of Scheffé
tests on all possible pairs of means indicated
that the significant results for all three sub-
tests were attributable to the difference be-
tween the means for Group X and those of
both the control groups; Groups C1 and C2


TABLE 1

MEANS OF DIFFERENCE SCORES ON THREE WAIS
SUBTESTS AND RESULTS OF ANALYSES
OF VARIANCE

WAIS subtest     Group X    Group C1    Group C2    F
Digit Span        -1.15        .35        -.10       8.35
Comprehension      -.60       1.35        1.15      11.72
Similarities       -.20       1.50        1.35       5.85




did not differ significantly from each other 
on any subtest (p > .25 on all subtests).
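
For readers who wish to retrace the analysis, a one-way analysis of variance and the Scheffé statistic for a pairwise comparison can be sketched as below. The function names and the toy data in the closing comment are hypothetical; the sketch computes the statistics only and leaves the comparison with a tabled critical F (on k - 1 and N - k degrees of freedom) to the reader.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    ms_within = ss_within / df_within
    f_ratio = (ss_between / df_between) / ms_within
    return f_ratio, df_between, df_within, ms_within


def scheffe_statistic(group_a, group_b, ms_within, k):
    """Scheffe F for a pairwise contrast among k groups.

    The value is compared with the critical F for (k - 1, df_within)
    degrees of freedom.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    error_term = ms_within * (1 / len(group_a) + 1 / len(group_b))
    return (mean_a - mean_b) ** 2 / (error_term * (k - 1))


# Hypothetical D scores for three small groups:
# one_way_anova([[-2, -1, -1, 0], [0, 1, 1, 2], [-1, 0, 0, 1]])
```

With three groups of 20 Ss each, as in the present design, the degrees of freedom would be 2 and 57.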

On all subtests the mean pre- to post- 
treatment difference scores for Group X were 
well below those for the control groups, as 
can be seen in Table 1. Whereas control 
group means were all positive, except for that 
of Group C2 on Digit Span, means for Group
X were all negative. The subjects taking the 
Rorschach before the WAIS subtests, then, 
showed performance on all subtests which 
was significantly inferior to that of the sub- 
jects in the control groups. 


DISCUSSION


The influence of the Rorschach upon WAIS 
performance was apparently considerable for 
some subjects. Though the mean decreases of 
the experimental group on the three WAIS 
subtests were not impressive, over half the 
experimental subjects showed decreased 
scores, D scores for these subjects ranging 
from -1 to -4. Since this might amount to
a decrease of 1 to 4 WAIS scaled-score
points, these results alone would seem to have
implications concerning sequence of tests 
in test batteries. 

Though the methodology of this study 
necessarily makes this test session dissimilar 
in many respects to clinical testing situations, 
the results still afford valuable theoretical and 
practical implications. The study by Gibby 
et al. (1954) showed that several other stand- 
ard psychological tests do not influence scor- 
ing variables on a subsequent Rorschach; the 
present study indicates that the reverse se- 
quence does produce test interference, at least 
when the subsequent test is a structured one. 
It should be noted, however, that the two 
studies used different populations. Gibby 
worked with neurotics and this study used a 
normal college sample. Nevertheless, the re- 
sults generally support L’Abate’s (1964) type 
of sequence and his rationale for it, but sug- 
gest a sequence different from those of the 
type reportedly used by Brown (1958) and 
Piotrowski (1958). 

In the formulation of the experiment, two 
possible explanations for the expected inter-
ference effect were presented. A third ex-
planation, which was not stated as a rationale
for the hypothesis, would hold that the
Rorschach might produce a persisting lower 
level of motivation, since it presents to the 
subject no clear criteria for successful per- 
formance. The results of this study do not 
support any one of the formulations as the 
sole explanation for interference, nor was 
the study designed to do so, but some 
speculations may be made concerning their 
relative contributions to the obtained effect. 

The experimental effect was obtained on 
all three subtests including the one admin- 
istered last, the Similarities subtest. The 
three proposed explanations may be evalu- 
ated with respect to their probable ability to 
persist through the last subtest in spite of the 
fact that presentation of the first two subtests 
would have begun to undo whatever inter- 
ference factor or factors had been generated 
by the Rorschach. That regressive, impulsive 
thought-process tendencies could have per- 
sisted this long is questionable, but the con- 
cept cannot be rejected on these grounds. The 
reduced motivation explanation is weakened 
as an important persisting interference factor, 
since the first two subtests would probably 
have renewed the level of motivation. On an 
a priori basis, the anxiety interpretation 
would appear to be the most plausible, since 
it is certainly conceivable that anxiety 
aroused in the Rorschach situation could 
persist over the period of time required for 
the three subtests. 

Some further empirical test of the anxiety 
hypothesis is provided by comparing the 
Rorschach protocols of the seven experi- 
mental subjects whose total WAIS difference 
scores indicated least interference (mean total 
D score = 1.14) with the seven subjects 
whose total difference scores indicated most 
interference (mean total D score = -5.00).
Six Rorschach variables proposed as indexes 
of anxiety (Phillips & Smith, 1953) were 
used in the comparison, no particular ra- 
tionale dictating the choice of anxiety indexes 
(number of: responses, popular responses, M, 
m, and cloud responses, and A%). The
Rorschach records of the two groups were 
partially analyzed and their means on the six 
Rorschach scoring variables computed. Scor- 
ing in some cases was only approximate, 
since no inquiry was available. Subjects show-
ing the most interference on the WAIS pro-
duced more constricted and conventional
Rorschach protocols, in general, than did 
those showing less interference. The directions 
of the means indicated somewhat more anx- 
iety among high interference subjects in the 
cases of all six indexes, but it should be 
emphasized that the cases were too few for 
significance tests. Degree of anxiety, then, 
may have been one important factor respon- 
sible for interference with WAIS performance. 
PERSONALITY CORRELATES OF TOLERANCE FOR
UNREALISTIC EXPERIENCES * 


ALAN FEIRSTEIN 
Yale University 


Tolerance for unrealistic experiences (TUE) describes a person’s capacity to 
perceive in ways which contradict usual modes of perception. This study of 
20 graduate students was designed to investigate personality factors which 
might be related to TUE. Psychoanalytic theory suggested that ability to 
engage in both unrealistic and drive-related thinking that was integrated with 
more realistic, neutral, socialized thought should relate positively to TUE. 
Amount and integration of unrealistic and drive-related thought were measured 
by the Holt primary-secondary process scoring of the Rorschach test, on a 
word-association test, and on an art-preference test. Results indicated that 
TUE related to the capacity to engage in both integrated unrealistic and in 
integrated drive-related thought. These results were discussed in terms of their 
relevance for the understanding of personality factors involved in rigidity and
in creativity.


The concept of “tolerance for unrealistic 
experiences” (TUE) was first proposed by 
Klein and Schlesinger (1951) to describe a 
person’s capacity to perceive in ways which 
contradict usual modes of perception. The phi 
phenomenon is an example of one task which 
they used to assess TUE. In this task the 
subject was shown two stimuli and told that 
they would be flashed in sequence. Percep- 
tion of a single moving object directly con- 
tradicted the subject’s knowledge that, in 
reality, two stationary figures were being 
flashed in sequence; hence, ability to see phi 
was considered an appropriate measure of 
TUE. 

TUE seems to play an important role in 
personality functioning. For example, the 
ability to perceive in unusual and unrealistic 
ways has been discussed in a number of 
theories as a necessary aspect in the early 
stages of creativity (Kris, 1952; Schachtel,
1959). Inability to perceive unusual and un-
realistic experiences has been linked with
rigidity in social beliefs (Frenkel-Brunswik, 
1949) and with “neurosis” (Hamilton, 1960). 
The present study investigates personality 
factors which relate to the capacity to per- 
ceive the unrealistic or unusual phenomena 
involved in TUE tasks. 

Two previous studies (Klein, Gardner, & 
Schlesinger, 1962; Klein & Schlesinger, 1951) 
investigated the relationship between un- 
realistic thinking on the Rorschach test and 
TUE. The prediction was that people who 
showed high TUE on the other perceptual 
tasks would engage in a large amount of 
fantasy and unrealistic thinking on the Rors- 
chach test. These two studies, however, did 
not investigate a variable that psychoanalytic 
theory suggests is of crucial importance in 
determining TUE—that is, the capacity of 
the subject to give unrealistic material that 
is integrated with more usual and logical ele- 
ments (Kris, 1952). Indeed, a preliminary 
study for the present research suggested that 
amount of unrealistic material given on the 
Rorschach test did not relate to performance 
on TUE tasks unless the integration of such 
material with more realistic thinking was also 
taken into account. The present study was 
designed to investigate the relationship be- 
tween TUE and quantified measures of 
amount and integration of unrealistic material 
on the Rorschach and several other tests. 




According to psychoanalytic theory, per- 
cepts and thoughts which deviate from usual, 
realistic experience with the world are po- 
tentially anxiety arousing (Freud, 1924). If 
the unusual or unrealistic experience is in- 
tegrated with more logical, coherent, and 
realistic thought, anxiety does not occur 
(Kris, 1952). The integration of unrealistic 
experiences with more rational thought makes 
coherent what otherwise might have been an 
anxiety-arousing experience, and allows the 
person to communicate his experience to 
others. 

Some people, in fact, seem to experience 
a sense of accomplishment and mastery in 
being able to engage in integrated unrealistic 
experiences (Kris, 1952). This kind of person 
enjoys and seeks unrealistic experiences, and 
should show high TUE. At the other extreme, 
the person who cannot integrate unrealistic 
experiences would be anxious about such ex- 
periences. For this reason he would strive for 
only realistic, logical, and coherent experi- 
ences and, therefore, where reality is clearly 
known, he should maintain realistic thoughts 
and percepts. Since the reality of the situa- 
tion is made clear and explicit in the TUE 
tasks, he would be expected to show low 
TUE. Avoidance of unrealistic ideation is 
not always possible for this person and, par- 
ticularly, when reality is not clearly struc- 
tured (as in projective tests), unintegrated 
and unrealistic experiences would be expected 
to appear. Between these two extremes is the 
person whose integration of unrealistic ex- 
periences is adequate, but who does not enjoy 
or seek such experiences. He should show 
moderate TUE. 

The first purpose of the present study is 
to test the theoretical expectations that people 
who engage in large amounts of well in- 
tegrated unrealistic thinking would show high 
TUE, while people who engage in large 
amounts of poorly integrated unrealistic 
thinking would show low TUE. Moderate 
TUE was expected in people with small 
amounts of unrealistic thinking. These ex- 
pectations will be specified in more opera- 
tional terms when these tasks are discussed in 
the Method section. 

The second purpose of the present study 
is to see whether individual differences in 
TUE relate to differences in the handling of
drives. One of the earliest and most important 
conflicts faced by the individual is that be- 
tween the expression of drives (e.g., sexual 
or aggressive impulses) and restrictions 
against such expression (Fenichel, 1945). A
central concept in psychoanalytic theory is 
that the way a person deals with drive 
conflict determines general modes of resolving 
other conflicts (Gill, 1959). This concept was 
tested in the present study by investigating 
whether the individual’s mode of handling 
the conflict between drives and restrictions 
related to the manner in which he dealt with 
the conflict aroused by unrealistic perceptual 
experiences (TUE). 

Using the same rationale as that presented 
for unrealistic thinking, the individual who 
can integrate drive-related thinking can en-
joy and seek situations which contain drive-
related material. He should also tend to enjoy
and seek unusual and unrealistic perceptions 
(high TUE). The person who cannot in- 
tegrate drive-related thinking should try to 
avoid it and should also avoid the conflict 
aroused by external events that contradict 
usual modes of perception (low TUE). Con- 
trol of drive-related thinking is not always 
possible for this person, however, and, par-
ticularly in unstructured situations (e.g.,
projective tests), unintegrated drive-related 
thinking would be expected to appear. 

The expectations were that people who en- 
gage in large amounts of well-integrated drive- 
related thinking would show high TUE, while 
people who engage in large amounts of poorly 
integrated drive-related thinking would show 
low TUE. Moderate TUE was expected in 
people with small amounts of drive-related 
thinking. 

METHOD 

Obtained from the university employment service, 
20 male graduate students served as paid Ss in this 
experiment. The Ss were seen individually in four
sessions, with total testing time approximately 12
hours. In the first session a test of problem solving
was administered; these data do not pertain to the
present study. In the second session the Rorschach
test was given and later scored by an examiner
who had no knowledge of S’s performance on the
other tasks. In the third session perceptual tasks
designed to measure TUE were administered.
Ss took Wild’s word-association test (1965) and,
in the fourth session, an art-preference test.




Measurement of TUE 


Four tasks were used to assess TUE; the first 
three to be described were used previously in stud- 
ies of TUE (Hamilton, 1957; Klein, Gardner, & 
Schlesinger, 1962). The fourth, a modification of a 
technique suggested by Haber,2 has not been used
in previous studies, but was included here because 
of its relevance to the concept of TUE. 

Phi phenomenon. In this task S was shown two 
stimuli and told that they would be flashed in 
sequence. The S’s ability to see one object moving, 
despite his knowledge that two stationary figures 
were being presented, was one measure that was 
used to assess TUE. 

Five series were run with each of two pairs of 
stimuli. In each series the experimenter gradually
decreased the separation time between the two 
flashes of the stimulus figures until S reported 
seeing one moving figure rather than two separate 
figures. The measure of TUE was the total number 
of times S reported movement for each of the two 
pairs of stimulus figures. 

Reversible figures. The typical reversible figure 
can be seen in either of two phases. With steady 
viewing, these phases alternate. The number of 
alternations seen in a fixed time is an appropriate 
measure of TUE, because it involves seeing two dif- 
ferent forms in the same stimulus. In addition to 
number of reversals, a second measure of TUE, 
obtained from one of the reversible figures, was 
the percentage of time S viewed the figure in the 
phase which conformed less to everyday experiences. 

The first figure was a 2 × 4 inch Schroeder stair-
case, in which one phase (a staircase coming from
the ceiling) conformed less to everyday experience 
than the other phase (an upright staircase). The 
second figure, a “double-cross,” was three inches 
in diameter, with four black and four white portions. 

There were two 30-second trials with each figure, 
in which the S viewed the figure steadily and in- 
dicated reversals by depressing or raising a tele- 
graph key that was connected to an electric timer. 
Scores were: (a) the number of reversals for the 
cross, (b) the number of reversals for the stairs,
and (c) the unusual-phase time for the stairs. 

Aniseikonic lenses. The Ss were given aniseikonic 
lenses to wear. Viewed through these lenses, the 
room appears distorted to most people. There are 
wide individual differences in the speed with which 
Ss recognize distortion and in the amount of dis- 
tortion observed, and these seem to be appropriate 
measures of TUE. Tolerant Ss should recognize the 
distortion quickly and see a large amount of dis- 
tortion. Intolerant Ss should maintain the customary 
shape of the room, possibly by attending to mono- 
cular cues that help maintain it. 

Two percent meridional sized lenses were used, 
with axes positioned obliquely at 45 degrees in the 
right eye and 135 degrees in the left. In this posi-
tion the lenses make the floor of the room appear to
slant down from the S and the wall in front to
slant away and to assume a trapezoidal shape, wide
at the top and narrow at the bottom. The S was 
asked to look around the room and describe in 
detail any differences from the usual appearance 
of the room. After 5 minutes the S was asked to 
adjust a black wooden rod to the vertical, with 
angle of deviation from the true vertical serving as 
the measure of amount of distortion.

Scores for the aniseikonic lenses were: (a) speed 
with which distortion was recognized and (b) average 
angle of deviation of the rod from the true vertical. 

Stimulus incongruity. In this task pictures were 
employed in which a part of the scene was in- 
congruous with respect to the total context of the 
picture. An S’s ability to recognize the unusual, 
incongruous element seemed an appropriate measure 
of TUE. 

Pictures were flashed tachistoscopically, with ex-
posure time increasing on every trial until S recog-
nized the incongruity in the picture. Two pictures
were used. One showed two basketball players with
one of the players dribbling a head, superimposed 
in a circle, in place of a basketball. The second 
picture showed a dock scene with men loading and 
unloading cargo. The incongruous element was a 
girl on the dock dressed in a striped prisoner’s 
outfit. She was lying on her back with her feet 
in the air, doing “bicycle pedaling” exercises.

The datum used for each of the two pictures was 
the speed (reciprocal of the number of seconds) with 
which the incongruous item was recognized.

Total measure of TUE. All nine measures of TUE
(two each for phi, lenses, and stimulus incongruity;
three for reversible figures) were converted to stand- 
ard scores. The total measure of TUE was the 
average of these nine scores. The average inter- 
correlation between the nine tasks that comprise the 
total measure of TUE was .153. Using this value,
the reliability coefficient of the total measure of 
TUE, as computed by the Spearman-Brown proph- 
ecy formula, was .62. 
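
The construction of the total TUE measure and its reliability estimate can be retraced with the sketch below. It is illustrative only: the function names are hypothetical, and the use of the population standard deviation in forming standard scores is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def standard_scores(values):
    """Convert raw scores on one measure to standard (z) scores across Ss.

    Assumption: the population standard deviation is used.
    """
    m, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("standard scores are undefined when all values are equal")
    return [(v - m) / sd for v in values]


def total_tue(measures):
    """Average each S's standard scores over all measures.

    `measures` is a list of per-measure score lists, with one entry per S
    in the same order in every list.
    """
    z_by_measure = [standard_scores(scores) for scores in measures]
    return [mean(subject) for subject in zip(*z_by_measure)]


def spearman_brown(mean_r, k):
    """Reliability of a composite of k parts with average intercorrelation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


reliability = spearman_brown(0.153, 9)  # about .62, the value reported in the text
```

Substituting the reported values gives 9(.153) / [1 + 8(.153)] = 1.377 / 2.224, which rounds to .62.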


Measurement of Amount and Integration of 
Unrealistic Thinking 


Rorschach test. The Rorschach was administered
with the Rapaport, Schafer, and Gill (1946) in-
structions, by a clinical psychologist who had no
knowledge of S’s performance on any other task.
Test sessions were tape-recorded and verbatim pro-
tocols were typed from the tapes. All Rorschach
data presented are based on the scoring of the in- 
dependent psychologist who administered the test. 

Holt’s (1963) system of Rorschach analysis was 
used to provide measures of amount and integration 
of unrealistic thinking. The manual allows measure- 
ment of deviations in Rorschach responses from 
logical, orderly thinking grounded in experience with 
the real world. Using Holt’s manual, each response 
was rated for the amount of unrealistic thinking 
contained in it, and the degree to which the un- 
realistic thinking in it was integrated with more 
logical, realistic thinking. 
Amount ratings (A) ranged from 0 (a perfectly
logical and realistic response) to 5 (a response with
blatant, nonlogical, or unrealistic elements). The
summary score was the sum of all amount ratings
divided by the total number of Rorschach responses
(Σ A/R).

Integration (I) ratings (Holt’s “defense effective- 
ness”) measured the effectiveness with which S made 
the unrealistic aspect in a response a more under- 
standable and acceptable communication. It was a 
global rating based on: (a) the form level (FL) of 
the response; (b) expressive behavior which re- 
flected anxiety or enjoyment about the response; 
(c) the degree to which the response was given in 
a cultural, aesthetic, intellectual, humorous, or other 
socially acceptable context; (d) indications of de- 
fensiveness, evasion, or disruption, various kinds of 
rationalizations, failure to give expected features 
(e.g., movement or color), and slight changes in FL
when the unrealistic aspect of the response was
introduced.

Integration ratings ranged from +2 (completely 
successful integration) to —2 (unsuccessful integra- 
tion). The summary score was the sum of all 
integration ratings divided by the number of such 
ratings. (Only responses having unrealistic thinking 
were scored for integration, hence the summary score 
was Σ I/Ru, where Ru is the number of responses
showing unrealistic thinking.) 

From the amount and integration ratings a com- 
bined measure (Holt’s “adaptive regression” score) 
was computed to assess amount of integrated un- 
realistic thinking. This measure involved multiplying 
the amount (ranging from 1 to 5) score by the 
integration (ranging from —2 to +2) score in each 
response, and dividing the sum of these products by 
the number of responses in the record [Σ(A × I)/R].
High scores in the combined measure indicate
extensive unrealistic thinking which is well
integrated; low scores indicate extensive unrealistic 
thinking which is poorly integrated. Intermediate 
scores indicate a smaller amount of unrealistic think- 
ing, with good integration receiving higher scores 
than poor. 
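
The three Rorschach summary scores described above (Σ A/R, Σ I/Ru, and Σ(A × I)/R) can be sketched as follows. This is an illustrative sketch, not Holt's scoring procedure: the `Response` class, the function name, and the sample ratings are hypothetical, and it assumes that a response counts as showing unrealistic thinking whenever its amount rating is above 0.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Response:
    amount: int                 # A rating, 0 (realistic) to 5 (blatantly unrealistic)
    integration: Optional[int]  # I rating, -2 to +2; None when amount is 0


def summary_scores(responses: List[Response]) -> dict:
    r = len(responses)                                   # R: all responses
    unrealistic = [x for x in responses if x.amount > 0]  # assumption: A > 0
    ru = len(unrealistic)                                # Ru
    amount = sum(x.amount for x in responses) / r        # sum(A) / R
    integration = (
        sum(x.integration for x in unrealistic) / ru if ru else None
    )                                                    # sum(I) / Ru
    combined = sum(x.amount * x.integration for x in unrealistic) / r  # sum(A*I) / R
    return {"amount": amount, "integration": integration, "combined": combined}


# Hypothetical four-response record
record = [Response(0, None), Response(3, 2), Response(5, -1), Response(1, 1)]
scores = summary_scores(record)
print(scores["amount"], scores["combined"])  # 2.25 0.5
```

Because each product A × I carries the sign of the integration rating, a record with many high-amount, poorly integrated responses is pulled toward the negative end of the combined score, as the text describes.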

The theoretical expectation was that combined 
scores would relate positively to TUE. No predic- 
tion was made for amount or integration scores 
taken alone. The amount measure does not dis- 
tinguish the S with high amount and good integra- 
tion (high TUE expected) from the S with high 
amount and poor integration (low TUE expected). 
The integration measure does not distinguish good 
integration with low amount (moderate TUE ex- 
pected) from good integration with high amount 
(high TUE expected). 

Art-Preference test. The second test that was used 
to assess ability to engage in integrated unrealistic 
experiences was a preference test for paintings with 
unrealistic content. Fifteen abstract (nonrepresenta- 
tional) paintings, selected to measure preference for 
forms which do not represent real objects, and 15 
fantastic paintings, selected to measure preference 
for views of the environment which could not really
exist (e.g., Edgar Ende’s “Midi,” showing a man 
being paddled in a canoe through the air), were 
paired with works that contained neither abstract 
nor fantastic elements. All paintings were previously 
employed by Child (1962) in a study of artistic 
preferences. In that study it was found that abstract 
and fantastic paintings were generally preferred less 
than paintings in six other categories. In the present 
study, in order to get maximum range in the scales, 
five of the least preferred paintings in each of the 
six other categories used by Child were randomly 
paired with the abstract and fantastic paintings.

These 30 pairs were interspersed with 30 other 
pairs of paintings which were selected to measure 
preferences for drive-related paintings (to be de- 
scribed in the next section). In each pair, the S was 
to indicate which picture he liked better. It was 
assumed that preferences for abstract and for fan- 
tastic works indicated that S’s unrealistic experiences 
were well integrated, because he was able to enjoy 
such experiences. Scores were the number of abstract 
and the number of fantastic paintings preferred. In 
order to get a total measure of preference for un- 
realistic paintings, these two scores were converted 
to standard scores and averaged. 
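
Converting the two preference counts to standard scores and averaging them can be sketched as below. The source does not say whether the sample or population standard deviation was used; the sample form (`statistics.stdev`) is assumed here, and the function names and example counts are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def standard_scores(xs: Sequence[float]) -> List[float]:
    """z-scores using the sample standard deviation (an assumption)."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]


def composite(*score_lists: Sequence[float]) -> List[float]:
    """Average of z-scores across measures, one value per subject."""
    z = [standard_scores(xs) for xs in score_lists]
    return [mean(vals) for vals in zip(*z)]


# Hypothetical counts of abstract and fantastic paintings preferred by four Ss
abstract = [3, 7, 10, 5]
fantastic = [8, 4, 12, 6]
total_preference = composite(abstract, fantastic)
```

Standardizing first keeps a measure with a wider raw range from dominating the total; the same approach applies wherever the text averages standard scores.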

The theoretical expectation was that preference 
for forms which do not represent real objects (abstract
paintings) and for views of the environment
which could not really exist (fantastic paintings) 
would relate positively to TUE. 

Word-Association test. Capacity to shift from un- 
usual to common associations when instructed to do 
so was a third measure used to assess unusual modes 
of thinking (see Wild, 1965). Three lists of 30 words 
each were employed. Following Wild’s procedure 
the test was first given with the usual Rapaport 
instructions (“spontaneous condition”). The test was 
then administered using instructions designed to 
elicit unusual associations (“unconventional condi- 
tion”); this was followed by administration with 
instructions designed to elicit common associations 
(“conventional condition”). The instructions designed
to elicit unusual or common associations consisted
of character sketches of an unconventional,
original person on the one hand, and a highly conventional
person on the other. The S was asked to
take the tests as the characters in the sketches
would. Associations were scored common if they
appeared in standard norms; all other associations
were considered unusual. The shift score was the
number of unusual associations given in the “unconventional
condition” minus the number given in
the “conventional condition.” High scores were obtained
by Ss who gave many unusual associations in
the “unconventional condition” and few in the
“conventional condition.” The shift score thus measured
a capacity to engage in both unusual and
common modes of thought, depending on the demands
of the situation.
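
A minimal sketch of the shift score follows: unusual associations in the unconventional condition minus unusual associations in the conventional condition. The norms table, stimulus words, and function names are hypothetical stand-ins; the study scored responses against published standard norms that are not reproduced here.

```python
from typing import Dict, List, Set, Tuple

Association = Tuple[str, str]  # (stimulus word, response word)


def count_unusual(associations: List[Association],
                  norms: Dict[str, Set[str]]) -> int:
    """A response is unusual if it does not appear in the norms for its stimulus."""
    return sum(
        1 for stimulus, response in associations
        if response.lower() not in norms.get(stimulus, set())
    )


def shift_score(unconventional: List[Association],
                conventional: List[Association],
                norms: Dict[str, Set[str]]) -> int:
    return count_unusual(unconventional, norms) - count_unusual(conventional, norms)


# Hypothetical norms and responses, for illustration only
norms = {"table": {"chair"}, "dark": {"light", "night"}, "bread": {"butter"}}
unconventional = [("table", "galaxy"), ("dark", "velvet"), ("bread", "butter")]
conventional = [("table", "chair"), ("dark", "light"), ("bread", "crust")]
print(shift_score(unconventional, conventional, norms))  # 1
```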

The theoretical expectation was that the shift score
would relate positively to TUE.

Measurement of Amount and Integration of
Drive-Related Thinking 


Rorschach test. In addition to measuring unreal- 
istic thinking, Holt’s (1963) scoring system also 
allows measurement of drive-related responses where 
no unrealistic aspects are apparent. Included are re- 
sponses containing oral, anal, sexual, exhibitionistic- 
voyeuristic, homosexual, and aggressive images. 
Exactly the same Rorschach measures (amount, in- 
tegration, and combined score) were used to assess 
drive-related responses as were used for unrealistic 
responses. 

The theoretical expectation was that scores on the 
combined measures would relate positively with 
TUE. As with the Rorschach scores for unrealistic 
thinking, no predictions were made for amount and 
integration scores taken alone. 

Art-Preference test. Ten paintings with sexual 
themes (mostly nudes), 10 with aggressive themes 
(mostly people fighting), and 10 with oral themes 
(mostly people eating) were paired with 30 neutral 
paintings and were interspersed with the 30 other 
pairs selected to measure preference for unrealistic 
paintings. The aggressive, sexual, and oral paintings 
were selected from six categories of paintings used 
by Child (1962)—the abstract and fantastic paint- 
ings were not used for this part. Each drive-related 
painting was paired with a painting in the same 
category that was judged about equally preferable 
in a previous sample of Ss. For example, a nude 
(sexual content) ranked fifteenth out of 60 in the 
single women category would be paired with the 
painting of a woman ranked fourteenth in that 
category.

In each pair S was to indicate which picture he 
liked better. It is assumed that preferences for works 
with sexual, aggressive, or oral themes indicate that 
S’s drive-related experiences are well integrated be- 
cause he can enjoy such experiences. Scores were 
the number of sexual, the number of aggressive, and 
the number of oral paintings preferred. In order to 
get a total measure of preference for drive-related 
paintings, these three scores were converted to stan- 
dard scores and averaged. 

Theoretical expectations are that sexual, aggressive, 
and oral preferences should relate positively to TUE. 


RESULTS 

TUE and Unrealistic Thinking 

Rorschach test. As indicated in Table 1,
the theoretical expectation that scores on the 
combined measure (assessing amount of in- 
tegrated unrealistic thinking) would relate to 
total TUE was confirmed. As can be seen in 
Table 1, for the combined score the correla- 
tion with total TUE was .49 (p< .025). 
Unless otherwise noted all p values reported 
are for one-tailed tests. 

There was no theoretical expectation that
amount or integration, taken alone, would
relate to TUE. Results in Table 1 indicate,
however, that integration correlated significantly
with TUE (r = .46, p < .05, two-tailed).


TABLE 1

CORRELATIONS BETWEEN TOTAL TUE AND MEASURES
OF UNREALISTIC THINKING

Measure                      Correlation

Rorschach test
  Amount                        .08
  Integration                   .46**a
  Combined score                .49***
  R                            -.06
  Ru                            .14
Art-Preference test
  Total                         .22
  Abstract                     -.11
  Fantastic                     .40
Word-Association test
  Spontaneous                   .02
  Unconventional                .52****
  Conventional                 -.26
  Shift                         .62****

Note. Abbreviated: R = total number of Rorschach
responses; Ru = number of Rorschach responses showing
unrealistic thinking.
a Two-tailed test.


It should be noted that no significant cor- 
relations were found between TUE and total 
number of Rorschach responses (R) or num- 
ber of responses with unrealistic material 
(Ru). The negligible correlations indicated
that the findings described above were not a
function of the use of R and Ru as denominators
in computation of the Rorschach summary
score ratios.

Art-Preference test for unrealistic paint- 
ings. As can be seen in Table 1, no con- 
firmation was found for the expectation that 
preference for abstract art would relate to 
total TUE (r= —.11). Confirmation was 
found for the expectation that preference for 
fantastic art would relate to total TUE (r
= .49, p < .025).

Word-Association test. As shown in Table 
1, the expectation that the shift score would 
relate to total TUE was confirmed (r= .62, 
p < .01). Also, a significant positive correlation
(r = .52, p < .01) was found between
the number of unusual associations given in
the “unconventional condition” and total
TUE. As expected, the correlation between 
the number of unusual associations given in 
the “conventional condition” and total TUE 
was negative, but the relationship was not 
significant (r = —.26). No appreciable cor- 
relation (r= .02) was found between the 
number of unusual associations given in the 
“spontaneous condition” and TUE. 

Total measure of amount of integrated un- 
realistic thinking. Total measure of amount 
of integrated unrealistic thinking was calcu- 
lated by combining measures on the Ror- 
schach, art-preference, and word-association 
tests. To this end, the combined score from 
the Rorschach, total score for preference for 
unrealistic paintings, and the shift score from 
the word-association test were converted to 
standard scores, and the average of these 
three measures was used as the total measure 
of amount of integrated unrealistic thinking. 
The correlation of this total measure with 
total TUE and with each of the nine indi- 
vidual tasks is shown in Table 2. 

The expectation that the total measure of
amount of integrated unrealistic thinking
would relate to total TUE was confirmed (r
= .66, p < .01). Significant correlations (p
< .05) were found with horse movement,
number of cross reversals, lens speed, and
incongruity in the “basketball” picture. Correlations
bordering on significance (p < .10)
were found for square movement, lens angle,
and incongruity in the “girl” picture. No significant
correlations were found for number
of reversals or unusual-phase time for the
staircase figure.


TABLE 2

CORRELATIONS BETWEEN TOTAL MEASURE OF AMOUNT
OF INTEGRATED UNREALISTIC THINKING
AND THE TUE MEASURES

TUE measure                     Correlation

Horse movement                     .41**
Square movement                    .32*
No. reversals, cross               .49***
No. reversals, stairs              .07
Unusual-phase time, stairs         .05
Lens speed                         .40**
Lens angle                         [illegible]
Incongruity, basketball            [illegible]
Incongruity, girl                  .35*
Total                              .66****

* p < .10.
** p < .05.
*** p < .025.
**** p < .01.


TABLE 3

CORRELATIONS BETWEEN TOTAL TUE AND MEASURES
OF DRIVE-RELATED THINKING

Measure                      Correlation

Rorschach test
  Amount                        .17
  Integration                   .20
  Combined score                .44**
  Ra                            .13
Art-Preference test
  Total                         .70****
  Aggressive                    .48**
  Sexual                        .34*
  Oral                          .23

Note. Abbreviated: Ra = number of Rorschach responses
with drive-related material.


TUE and Drive-Related Thinking 


Rorschach test. Correlations of TUE with 
Rorschach measures are shown in Table 3. 
As expected, the combined score for drive- 
related thinking derived from the Rorschach 
related significantly with total TUE. There 
was no theoretical expectation that amount or 
integration, taken alone, would be related to 
TUE, and no significant relationships were 
found for these two scores. 

As previously reported, no significant correlation
was found between TUE and total
number of Rorschach responses (r = -.06).
Table 3 indicates that TUE did not relate
significantly with number of responses with
drive-related material (Ra) either (r = .13).
These negligible correlations indicate that the
findings described above were not a function
of the use of R and Ra as denominators in
computation of the Rorschach summary score
ratios.

Art-Preference test for drive-related paintings.
Table 3 indicates that total preference
for drive-related paintings related significantly
with total TUE. Preference for aggressive
paintings correlated significantly
with total TUE (r = .48, p < .05), and
preference for sexual paintings tended to relate
to TUE (r = .34, p < .10). Preference
for oral paintings did not correlate significantly
with total TUE.


TABLE 4

CORRELATIONS BETWEEN TOTAL MEASURE OF AMOUNT
OF INTEGRATED DRIVE-RELATED THINKING
AND TUE MEASURES

TUE measure                     Correlation

Horse movement                     .43**
Square movement                    .38**
No. reversals, cross               .38**
No. reversals, stairs              .32*
Unusual-phase time, stairs         .27
Lens speed                         .09
Lens angle                         .24
Incongruity, basketball            .46***
Incongruity, girl                  [illegible]
Total                              .71****

* p < .10.
** p < .05.
*** p < .025.
**** p < .01.


Total measure of amount of integrated 
drive-related thinking. Total measure of 
amount of integrated drive-related thinking 
was calculated by combining total score for 
preference for drive-related paintings with 
the combined score of the Rorschach test. 
Table 4 shows the correlation of this total 
measure with total TUE and with the nine 
individual TUE scores. 

The expectation that the total measure of 
amount of integrated drive-related thinking 
would relate to TUE was confirmed (r = .71, 
p < .01). Significant correlations (p < .05)
were found with horse and square movement, 
number of cross reversals, and both incongru- 
ous pictures. Correlations bordering on sig- 
nificance (p < .10) were found for number of 
stair reversals. No significant correlations 
were found for unusual-phase time for the 
stairs or for lens speed or lens angle. 


DISCUSSION 


The diversity of tasks which related to the 
perceptual measures of TUE supports the 
hypothesis suggested by Gardner, Holz- 
man, Klein, Linton, and Spence (1959) that 
TUE represents a predictable style of be- 
havior that determines a person’s responses 
in a wide variety of situations. The
word-association test is particularly relevant, for
it adds further evidence to findings by Kap- 
lan (1952) and Martin (1954) that tolerance 
for unrealistic experiences has relevance in 
tasks which do not involve visual perception. 

High TUE was found in the individual 
who was generally able to integrate large 
amounts of both unrealistic and drive-related 
material with more logical and neutral think- 
ing. The overall picture was of a person who 
relaxed controls enough to enjoy unrealistic 
and drive-related experiences, but who main- 
tained a degree of integration of such ex- 
periences by combining them with logical, 
orderly, and socially accepted thinking. 

Lowest TUE scores were found in the in- 
dividual whose thinking was often filled with 
unrealistic and drive-related thoughts and 
images, but who was unable to integrate such 
material with more conventional or neutral 
thoughts. It would appear that low TUE 
represented an effort to maintain simple or- 
derly experiences in a person who tended to 
become confused and disturbed by unrealistic 
and drive-related thoughts. 

A number of other studies have investi- 
gated concepts similar to TUE. Frenkel- 
Brunswik’s (1949) work with tolerance of 
ambiguity, Barron’s (1952) work with pref- 
erence for complexity, Berlyne’s (1960) in- 
vestigations of response to novelty, and Wit- 
kin, Lewis, Hertzman, Machover, Meissner, 
and Wapner’s (1954) work on space orienta- 
tion all seem to be tapping traits similar to 
TUE. One characteristic that these concepts 
have in common is that they all suggest that 
the subject’s difficulties facing conflictual or 
complex perceptual situations are related to 
anxiety coming from sexual, aggressive, and 
other potentially antisocial impulses—the 
person with more conflict about drives is 
assumed to have a greater tendency to avoid 
all conflictual experiences. 

The findings in the present study indicated 
that a failure to achieve integration of drive- 
related thought with more neutral thought 
correlated with an avoidance of the conflictual 
situations involved in the TUE tasks. These 
results correspond to the Witkin et al. (1954) 
finding that people who are more field-de- 
pendent have relatively poor control over 
aggressive and sexual impulses. The findings
of Witkin and his co-workers and those of 
the present study thus indicate that people 
who can integrate drive-related thought seem 
generally more open to internal sensations, 
with less dependence on usual experiences in 
the external environment. 

This capacity to resist overdependence on 
usual experiences seems to be related to con- 
cepts of creativity and rigidity, and the rela- 
tionships between the present findings and 
these areas are worthy of discussion. Theo- 
retical considerations suggest that TUE prob- 
ably plays an important role in the creative 
process. The ability to make original contri- 
butions seems to require an openness to new 
relationships and an ability to see beyond 
the constraints of conventional labels and 
anticipations. Schachtel (1959), for example,
states that creativity requires the ability to 
see beyond the “preformed cliches and angles 
as make up the world of ‘reality’ seen by 
society.” Similarly Bellak (1958) comments 
that the creative process requires a weakening 
of the “sharply defined boundaries of figure 
and ground, of logical, temporal, spatial, and 
other relations.” 

The results of the present study indicate 
that the ability to engage in integrated un- 
realistic and in integrated drive-related think- 
ing relates to TUE. Insofar as TUE seems to 
be a necessary aspect of the creative process, 
the results tentatively suggest that these same 
abilities (i.e., to engage in integrated unrealistic
and in integrated drive-related thinking)
might also be involved in creativity.
Such an interpretation is consistent with pre- 
vious findings by Wild (1965) and by Pine 
and Holt (1960). These studies, taken to- 
gether, lend support to the psychoanalytic 
concept of creativity (Kris, 1952) as involv- 
ing the integration of primitive, drive-related, 
nonlogical modes of thinking with more neu- 
tral, reality-oriented, logical thoughts. 

It would be useful in future research to 
investigate whether people in fields assumed 
to require creative ability (e.g., art) show 
higher TUE than control groups selected 
from the general population. Similarly, will 
TUE distinguish “more creative” from “less 
creative” people within each field? The psy- 
choanalytic theory of creativity (Kris, 1952) 
suggests that the capacity to perceive objects
in unusual ways is important in almost all 
creative fields. A question of interest is 
whether TUE will relate differentially in 
fields which seem to require different degrees 
of dependence on visual perception. For ex- 
ample, will TUE differentiate more from less 
creative individuals in art but not in chem- 
istry? 

Low TUE also seems to be an aspect of
rigidity when rigidity is defined, as by Cattell
and Tiner (1949), in terms of the ease with which old
established habits may be changed in the
presence of new demands. The aniseikonic
lenses, for example, require relinquishing the 
habit of seeing rooms as rectangular and 
floors and ceilings as level. 

Although there is a great amount of work 
in the area of rigidity, few empirical studies 
have been done which shed much light on 
the nature of underlying personality factors 
which lead to rigid behavior. An exception is 
work done by researchers interested in “The 
Authoritarian Personality” (Adorno, Frenkel- 
Brunswik, Levinson, & Sanford, 1950). 
Workers in this area point out that a wide 
range of rigid behavior (e.g., as expressed 
in racial prejudice, rigidity in problem solv- 
ing, intolerance of ambiguity, etc.) consti- 
tutes a “counter-balance to underlying con- 
flicts often verging on chaos” (Frenkel- 
Brunswik, 1949). 

The concept of “chaotic” underlying con- 
flict was assessed in the present study by 
measures of amount of poorly integrated un- 
realistic and poorly integrated drive-related 
thinking. The low TUE subject was unable to 
organize fantasy and drive-related thoughts, 
and it appeared that in order to defend
against these disorganized internal experiences
the low TUE subject maintained a
rigid, conventional, “safe” mode of responding
in situations where socially acceptable,
conventional responses were well delineated.
This supports the concept that rigidity can
represent defenses against disorganized internal
experiences.

Before results from the present study could
be applied to other types of rigidity it would
be necessary to illustrate that the types of
perceptual rigidity reflected in low TUE are
related to rigidity in problem solving,
social values, and so on. An indication that
there is some generality of TUE to other 
measures of rigidity is found in Hastorf’s 
(1959) report of a significant positive cor- 
relation between time to recognize aniseikonic 
lens distortion and time to solve the extinction
problem in Luchins’ Einstellung test.

Finally, it is worthwhile to return to a 
theoretical issue presented in the introduc- 
tion. According to psychoanalytic theory, one 
of the earliest and most important conflicts 
faced by the individual is that between the 
expression of drives and restrictions against 
such expression. A central concept in psycho- 
analysis is that modes of dealing with this 
early drive conflict will determine the way 
the individual deals with later conflictual sit- 
uations. The results of the present study in- 
dicated that a failure to achieve integration 
of drive-related thinking was associated with 
an avoidance of the conflictual situations in- 
volved in the TUE tasks. These results are 
consistent with this general concept from psy- 
choanalytic theory. The present study, of 
course, provides no evidence as to how the 
TUE mode of perceiving and the manner of 
dealing with drive conflicts are related to 
each other in the course of psychological 
development. Such information can only be 
obtained from future longitudinal studies 
which would relate impulse control and TUE 
at various developmental stages. 
DEFENSIVENESS AND NEED FOR APPROVAL


LAWRENCE A. NEWBERRY 1 


Purdue University


Need for approval (nApp) and expectancy (Exp) of approval vs. disapproval 
were manipulated under high- and low-consequences (Con) conditions (in a 
3X2X2 ANOVA design) to determine their relationship to defensiveness 
(Def). The 3 nApp levels were obtained by trichotomizing scores on the 
Marlowe-Crowne Social Desirability scale; Exp was manipulated by provid- 
ing Ss with either positive or negative verbal reinforcement during an inter- 
view; Con consisted of E posing as a threatening authority figure to 30 Ss 
and as a student to the remaining 30 (N=60 undergraduates). 3 separate 
dependent-variable measures were used: the K Scale, Gough’s Def scale, and 
the Rotter Incomplete Sentences Blank. It was predicted that Def would in- 
crease as a function of nApp, Exp of disapproval, and high Con. However, Ss 
became more defensive only under the high Con condition (p<.05), and 
nApp was related to Def only under Exp of approval conditions (p < .05). 


In the past few years, several studies have 
suggested that defensiveness predisposes 
clients to terminate individual psychotherapy
prematurely (e.g., see Hiler, 1959; McNair,
Lorr, & Callahan, 1963; Taulbee,
1958). Also, Garfield and Affleck (1961)
found that therapists express extremely nega- 
tive feelings toward defensiveness in clients. 
Thus, defensiveness may also affect thera- 
peutic outcome through its effects on thera- 
pists’ attitudes, as well as exerting an effect 
on termination of therapy. 

These findings imply that defensiveness is 
a major barrier to successful psychotherapy, 
or even continuation of therapy. However, the 
variables which might predispose the client 
to defensiveness have been less thoroughly 
studied. The most obvious exceptions have 
been the theoretical paper by Hogan (1952) 
and the experimental studies by Strickland 
and Crowne (1963) and Lamb and Fretz 
(1964). Both experimental studies implicated 
need for approval (measured by the Marlowe- 
Crowne Social Desirability scale) as a con- 
tributor to defensiveness. 

Strickland and Crowne (1963) found that 
patients with high Marlowe-Crowne Social
Desirability (M-C SD) scores (i.e., high 
need for approval) tended to terminate psy- 
chotherapy prematurely and that therapists 
rated these patients as being more defensive 
and less improved than patients with lower 
M-C SD scores. Lamb and Fretz (1964)
replicated the finding regarding premature 
termination of therapy. In addition, by ana- 
lyzing the content of the therapeutic inter- 
views given to their subjects, Lamb and 
Fretz suggested that patients with high M-C 
SD scores were more defensive than patients 
with low M-C SD scores. 

The present study was designed as an 
extension of Strickland and Crowne’s (1963) 
findings regarding the relation of need for 
approval (nApp) to defensiveness. In Rotter’s 
(1954) model, from which the M-C SD 
scale evolved, social desirability could be 
viewed as a measure of the “behavior potential”
of seeking approval. Whether differences
in strength of approval seeking (i.e.,
M-C SD scores) reflect differences in need
(reinforcement value) for approval is not
clear. It is equally possible that these differences
are mediated by variations in expectancy
of approval and/or minimal goal (i.e.,
how much approval is sought, rather than how
important it is).

Since the M-C SD scale cannot provide 
independent measures of need, expectancy, 
and minimal goal, other operations are re- 
quired to separate these variables. In the 
present study, need for approval and expectancy
of approval were experimentally manipulated
in an attempt to discover which variable
is most clearly related to defensiveness.
Inclusion of the M-C SD scale was intended
to provide a means of determining any possible
interactions involving individual differences
in approval seeking.

The purpose of the study was to further 
explore the antecedents of defensiveness in 
an analogue to psychotherapy. The primary 
hypotheses were that defensiveness is directly 
related to (a) need for approval as measured 
by the M-C SD scale, (b) an experimentally
manipulated expectancy of disapproval, and 
(c) the objective consequences of disapproval, 
or need for approval as experimentally 
manipulated. 


METHOD 


The Ss were 29 male and 31 female introductory 
psychology students at Purdue University, aged 
17-25. All Ss, with one exception (a senior), were 
freshmen and sophomores. A pilot study (N = 18) 
and a separate, independent sample (N= 75) of 
introductory psychology students showed no differ- 
ences between the sexes on the dependent-variable 
measures used; consequently, Ss were randomly as- 
signed to the four experimental conditions without 
regard to sex. 


Independent Variables 


Need for approval. The M-C SD scale was used 
to measure need for approval. Within each of the 
four experimental conditions, Ss were trichotomized 
as a function of their scores on this scale. Divisions 
were arbitrarily made at one probable error above 
and below the total group mean (scores of 18 or 
above for high nApp Ss, and 11 or below for low 
nApp Ss). 
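
The trichotomization can be sketched as below, using the standard relation that one probable error equals 0.6745 standard deviations. The scores, the function name, and the handling of values falling exactly on a cutoff are illustrative assumptions; the study itself reports cutoffs of 18 or above and 11 or below for its own sample.

```python
from statistics import mean, stdev
from typing import List, Sequence

PROBABLE_ERROR_FACTOR = 0.6745  # one probable error = 0.6745 standard deviations


def trichotomize(scores: Sequence[float]) -> List[str]:
    """Label each score high, medium, or low using cutoffs one
    probable error above and below the group mean."""
    m = mean(scores)
    pe = PROBABLE_ERROR_FACTOR * stdev(scores)
    low_cut, high_cut = m - pe, m + pe
    labels = []
    for s in scores:
        if s >= high_cut:      # assumption: boundary values count as high
            labels.append("high")
        elif s <= low_cut:     # assumption: boundary values count as low
            labels.append("low")
        else:
            labels.append("medium")
    return labels


# Hypothetical M-C SD scores for nine Ss
example_scores = [5, 9, 11, 14, 15, 16, 18, 20, 24]
print(trichotomize(example_scores))
```

In a normal distribution, cutoffs at one probable error on either side of the mean place about half the cases in the middle group and a quarter in each extreme group.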

Expectancy of approval versus disapproval. A 
second independent variable involved a manipulation 
of the Ss’ expectancy of receiving approval or dis- 
approval from E. The variable was manipulated 
in a “clinical interview” analogue, during which all 
responses given by Ss were followed either by an
appropriate positive reinforcement which indicated
E’s approval or by a negative reinforcement which
indicated his disapproval. Reinforcements were
primarily verbal.

Consequences of disapproval. The third independ- 
ent variable was the implied consequences of the 
clinical interview. In the high-consequences con- 
dition, E presented himself as a research psychologist 
from the National Institute of Mental Health who
was studying the incidence of mental illness on
college campuses. All Ss were told that a previous
study had shown that the incidence of mental illness
among college students was more than twice as high
as that of any other population group in the United
States. As a result, the federal government had
instituted a study in which several psychologists
were to test and interview various samples of
students throughout the nation in an effort to discover
some of the reasons for the high number of
mental aberrations among college students. In addition,
Ss were told that in order to obtain permission
to use Purdue students in the study, the researchers 
had been required by Purdue’s office of the president 
to report any students who showed signs of mental 
aberration. As a result, the researchers were unable 
to assure Ss that the test and interview material 
elicited from them would be confidential. 

In the low-consequences condition, E presented 
himself as a goof-off graduate student who was 
conducting a study relating to personality variables 
of college students. The E made it clear that he was
performing the research only because it was required 
of him in his graduate program. 


Dependent-Variable Measures 


Defensiveness. The MMPI K scale (Dahlstrom & 
Welsh, 1960) and the Defensive scale (ACL-Df) 
from the Gough Adjective Check List (Gough & 
Heilbrun, 1965) were used as independent measures 
of defensiveness.

Maladjustment. The Rotter Incomplete Sentences 
Blank (ISB) was given to all Ss as a measure of 
maladjustment (Rotter & Rafferty, 1950). This test 
was included to determine if defensive Ss would 
tend to “fake good” (i.e., present themselves in a
favorable light) on a free-response test of malad- 
justment. 


Procedure 


All Ss were asked to complete the M-C SD scale 
and a short biographical information sheet prior to 
any of the experimental manipulations. To avoid 
arousing suspicion, Ss were told that E had to leave 
the building for a few minutes and that he was 
giving them a short questionnaire to complete so 
their time would not be wasted in his absence. 
Postexperimental interviews indicated that none of 
the Ss had questioned this explanation.

Following the administration of the M-C SD scale, 
all Ss were interviewed for approximately 30 
minutes. As closely as experimental conditions per- 
mitted, the interviews followed the analogue of a 
clinical intake interview. The interviews followed 
a structured outline which was identical for all 
Ss. In general, Ss were asked about current life 
problems, sources of frustration, moral and religious 
standards, and so on. Prior to conducting the inter-
view, E had presented himself either as a goof-off
graduate student or as a research psychologist (low-
and high-consequences conditions, respectively).
During the interview, E manifested either approval or disapproval
of the Ss’ responses (expectancy of approval versus 
disapproval conditions). All interviews were tape- 
recorded, and the microphone was placed in open 
view on the table which separated the interviewer 
from the interviewee. 

Following the interview, each S was individually 
administered the K scale, Rotter ISB, and Gough 
ACL, in that order.


Desensitization


Following the experimental manipulations and 
testing, all Ss were interviewed and given full in-
formation concerning the real purposes of the experi-
ment. It was felt that this interview effectively
eliminated any negative feelings that Ss might 
have developed during the experiment. In addition, 
Ss were asked to maintain the security of the 
project. All Ss agreed to this, and there was no 
indication that security was violated during the 
period of the study. Finally, all Ss were assured 
that the information elicited from them during the 
interview and from the psychological tests would be 
completely confidential. 


Experimental Design and Analysis 


A separate 3 × 2 × 2 fixed-factor analysis of
variance (ANOVA) was performed for each of the 
three dependent-variable measures. The trichotomiza- 
tion of the M-C SD scores within each condition 
resulted in unequal cell frequencies (4-7 Ss per 
cell) and required the use of an unweighted means 
analysis in the computation of the ANOVAs (Winer, 
1962). Homogeneity of variance in all of the 
ANOVAs was demonstrated by applications of 
Hartley's Fmax test (Winer, 1962). 
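
The two quantities mentioned in this paragraph can be illustrated with a minimal sketch. It assumes that Hartley's Fmax is the ratio of the largest to the smallest cell variance, and that the unweighted means analysis uses the harmonic mean of the cell frequencies as the effective cell size, which is how the approach is commonly described. The function names and the cell scores are hypothetical.

```python
import statistics


def hartley_fmax(cells):
    """Ratio of the largest to the smallest sample variance across cells."""
    variances = [statistics.variance(cell) for cell in cells]
    return max(variances) / min(variances)


def harmonic_mean_cell_size(cell_sizes):
    """Effective n per cell when cell frequencies are unequal."""
    return len(cell_sizes) / sum(1.0 / n for n in cell_sizes)


# Hypothetical scores in two cells, for illustration only.
example_cells = [[1, 2, 3], [2, 4, 6]]
example_fmax = hartley_fmax(example_cells)

# Hypothetical cell sizes at the extremes of the reported 4-7 range.
example_n = harmonic_mean_cell_size([4, 7])
```

The observed Fmax would then be compared with a tabled critical value for the number of cells and the degrees of freedom per cell; that lookup is not shown here.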


RESULTS 


The ANOVA summary in Table 1 shows 
that only the objective consequences of dis- 
approval were significantly related to defen- 
siveness. This relationship was significant 
(p < .05) for both the K scale and ACL-Df, 
and it was nearly significant (p < .10) for 
the Rotter ISB measure. Thus, only the third 
hypothesis was supported. 
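
As a quick check on the consequences effect, each F ratio is the effect mean square divided by the error mean square. The sketch below uses the mean squares as read from Table 1; the dictionary and function names are illustrative only.

```python
# Mean squares as read from Table 1 (consequences effect and error term).
CONSEQUENCES_MS = {"K scale": 81.5, "ISB": 235.0, "ACL-Df": 664.0}
ERROR_MS = {"K scale": 15.6, "ISB": 63.8, "ACL-Df": 135.2}


def f_ratio(ms_effect, ms_error):
    """F ratio for a fixed-factor effect tested against the error mean square."""
    return ms_effect / ms_error


f_values = {
    measure: round(f_ratio(CONSEQUENCES_MS[measure], ERROR_MS[measure]), 2)
    for measure in CONSEQUENCES_MS
}
```

The rounded values agree with the F ratios tabled for the consequences effect (5.22, 3.68, and 4.91), each on 1 and 48 degrees of freedom.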

Since previous evidence (Lamb & Fretz, 
1964; Strickland & Crowne, 1963) had sug-
gested a relation between M-C SD scores and
defensiveness, the data were further analyzed 
to seek explanation of the absence of such a 
relationship in this study. First, an independ- 
ent sample of Ss (N = 75) was drawn from 
the same pool from which the experimental Ss 
were solicited. This sample was given the 
M-C SD scale and the K scale in a single 
group-testing situation without experimental 
manipulations of any type. The Pearson cor- 
relation between the two test measures was 
.47 (p < .005), which was roughly con-
sistent with expectations from Strickland and
Crowne (1963) and Lamb and Fretz (1964). 
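
The significance of a Pearson correlation of this size can be checked with the usual t transformation, t = r * sqrt(N - 2) / sqrt(1 - r^2), on N - 2 degrees of freedom. The sketch below is an illustration under that standard formula; the function name is hypothetical and the report does not state which test was actually used.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Values reported for the independent sample: r = .47, N = 75.
t_value = t_from_r(0.47, 75)
degrees_of_freedom = 75 - 2
```

A t of about 4.5 on 73 degrees of freedom is well beyond the values usually tabled for the .005 level.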

Next, correlations among all four measures 
used in the study were computed within each 
experimental condition (Table 2). This pro- 
vided what might be termed a “convergent 
validity” intercorrelation matrix within each 
condition. In Table 2, it can be seen that the 
three dependent-variable measures were 
highly correlated in the low-consequences-
approval condition, but not in the other con-
ditions (except for the r = -.70 between
ISB and K scale scores in the low-conse-
quences-disapproval condition).

M-C SD scores were unrelated to ACL-Df 
scores and were related (r = -.47) to ISB
scores only in the high-consequences-dis-
approval condition. M-C SD and K scale
scores were significantly related in both ap- 
proval conditions (with correlations similar to 
those found with the independent sample), but 
not in the two disapproval conditions. The dis- 
approval conditions seemed to obscure the 


TABLE 1
SUMMARIES OF ANALYSES OF VARIANCE OF K, ACL-Df, AND ISB SCORES

                            K scale            ISB             ACL-Df
Source              df     MS      F        MS      F        MS      F
nApp                 2    26.6    1.71     29.7     <1       8.2     <1
Expectancy           1     7.1     <1      84.8    1.32      4.7     <1
Consequences         1    81.5    5.22**  235.0    3.68*   664.0    4.91**
nApp × Exp           2    17.6    1.13     21.7     <1      16.1     <1
nApp × Con           2     7.6     <1      15.5     <1      25.5     <1
Exp × Con            1     2.3     <1       0.4     <1      13.3     <1
nApp × Exp × Con     2     4.4     <1     146.1    2.28     80.8     <1
Error               48    15.6             63.8            135.2

*p < .10.
**p < .05.


TABLE 2
INTERCORRELATIONS OF THE M-C SD SCALE AND THE DEPENDENT-VARIABLE SCORES IN
EACH EXPERIMENTAL CONDITION

         Low consequences                          High consequences
         Approval             Disapproval          Approval             Disapproval
Test     K      ISB    ACL    K      ISB    ACL    K      ISB    ACL    K      ISB    ACL
M-C SD   .46*   -.27   .09    .34    .10    -.19   .49*   -.03   .16    -.13   -.47*  0
K               -.72   .85*          -.70*  .25           -.27   .31           [?]    .10

Note. All n = 15. [?] = entry illegible.
*p < .05.

relationship between M-C SD and K scale 
scores. Finally, it can be seen that ACL-Df
scores showed little evidence of reflecting any
differences in “defensiveness,” and ISB 
maladjustment scores were inversely related 
(as expected) to M-C SD scores in the high- 
consequences—disapproval condition. 


DISCUSSION


Although the present study was not in- 
tended to replicate the Strickland and Crowne 
(1963) or the Lamb and Fretz (1964) stud-
ies, certain post hoc results seem to provide
further evidence for their conclusions. In
both approval conditions of the present study, 
the relationship of M-C SD scores to K scale 
scores was found. Since the disapproval con-
ditions in the present design were scarcely
an analogue of the ordinary behavior of a 
therapist, perhaps the relationship between 
need for approval and defensiveness that was
found by the aforementioned authors should
not have been expected. 

Although both the K scale and the ACL-Df 
scale showed a significant consequences main
effect, three of the four correlations between 
the two scales were nonsignificant (Table 2). 
If both (or either) of them measure a gen-
eralized attribute that may be termed “de-
fensiveness,” they are clearly measuring dif-
ferent aspects of it. Since ACL-Df was not
related to any of the other scales in this 
study, except in the low-consequences-ap-
proval situation, its conceptual significance
remains unclear. It is of interest to note that 
Gough and Heilbrun (1968) found a sim- 


ilarly low and nonsignificant relationship be- 
tween the K scale and ACL-Df. 

Within Rotter’s model (1954), “behavior
potential” is regarded as a multiplicative 
function of expectancy and reinforcement 
value. According to this formulation, before 
a person will seek approval, he must expect 
that his behavior actually will lead to ap- 
proval and that the forthcoming approval 
will have some reinforcement value above his 
minimal goal. Similarly, defensiveness may be 
viewed as a function of the expectancy that 
behavior such as “presenting one’s self in a 
favorable light” will win approval, plus the 
knowledge that approval is rewarding. 
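
The multiplicative relation described above can be made concrete with a toy calculation. This is only a sketch of the idea as stated in the text: the function name and the numerical scales for expectancy and reinforcement value are arbitrary assumptions, not part of Rotter's formulation or of the present study.

```python
def behavior_potential(expectancy, reinforcement_value):
    """Toy multiplicative model: potential is the product of the two terms."""
    return expectancy * reinforcement_value


# If approval is not expected at all, a high reinforcement value yields no potential.
no_expectancy = behavior_potential(0.0, 5.0)

# A moderate expectancy combined with a valued reinforcement yields some potential.
some_expectancy = behavior_potential(0.5, 4.0)
```

The point of the product form is that either term near zero suppresses the behavior, which is why both an expectancy of approval and a rewarding value of approval are required before approval seeking is predicted.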

This formulation seems to fit the behavior 
seen in the present study. Either (or both) 
of two types of behavior could have been 
operating here. First, the defensiveness could
have taken an internally oriented form, where 
the high nApp subjects would have presented 
themselves in a favorable light because they 
have a highly vulnerable self-concept which 
must be protected at all costs; they cannot 
admit their deficiencies to anyone, even to 
themselves. This appears to be the type of 
defensive person described by Strickland and 
Crowne (1963) and the type that was seen 
under the two approval conditions in this 
study. 

Second, the defensiveness could have taken 
an externally oriented form where subjects 
would have tried to please or impress the 
experimenter simply to insure that he would 
display approval. Their need to protect a 
vulnerable self-concept may have been lower 
than the need to obtain external gratification
of approval needs. This seems to be the type
of behavior that subjects manifested during 
the two disapproval conditions. That is, when 
disapproval was directed against their at- 
tempts to present themselves in a favorable 
light, they eventually dropped this form of 
defensive behavior and began agreeing with 
the experimenter’s negative evaluation of 
them. This appeared to be an attempt to ob- 
tain approval from the experimenter, even at 
the cost of being unable to overtly protect 
their self-concepts. It seemed that their need 
to obtain gratification of approval needs from 
an external source was greater than their need 
to protect a vulnerable self-concept. 

It is also possible that individuals may 
choose either of the forms if the person with 
whom they are interacting provides them with 
a greater opportunity to use one form rather 
than another. In this case, the defense could 
be viewed as a reaction to the environmental 
context rather than a “built-in” attribute of 
the subjects. 

The foregoing comments are strictly specu- 
lative and represent no more than an attempt 
to explicate some of the theoretical issues 
relating to defensiveness. The only hypothesis 
that received experimental verification was 
the one which stated that the objective con- 
sequences of disapproval are directly related 
to defensiveness. There was no evidence that 
differences in strength of approval seeking 
(i.e., M-C SD scores) reflect differences in
need for, or expectancy of, approval (as 
experimentally manipulated). That is, the 
ANOVAs indicated only that all subjects, 
regardless of their measured need for ap- 
proval, become more defensive when con- 
fronted with the potentially dangerous con-
sequences of interacting with a threatening
authority figure.


The present study was undertaken to help clarify the relationship of repression-
sensitization to adjustment status, social desirability, and acquiescence re-
sponse set. Ss were 83 hospitalized male medical/surgical patients classified as
adjusted and 78 hospitalized male psychiatric patients classified as maladjusted.
The defensive style of repression occurred with significantly higher frequency
in the adjusted population, while the defensive style of sensitization occurred
with significantly greater frequency in the maladjusted population. There was
a significant correlation between the repression-sensitization dimension and
both social desirability and acquiescence response set. However, the correla-
tions were low enough to suggest that the repression-sensitization dimension is
tapping something not measured by the other 2 variables.


1 This paper is based on portions of a dissertation
submitted in partial fulfillment of the
requirements for the degree Doctor of Philosophy at
Adelphi University, 1966. The author wishes to
express her gratitude to Harold Levine and Norman
Berk for their encouragement and suggestions during
all phases of this research. Particular thanks are
extended to George Stricker, for his unstinting
guidance and support, and to Bernard Locke and
the staff of the Veterans Administration Hospital,
Manhattan, for their fullest cooperation and support.


The repression-sensitization dimension is a
bipolar categorization of defensive behaviors
which is felt to characterize two different
modes of defensive adaptation to threat.
Byrne (1964) states:


At (the repression) end of this continuum of
defensive behaviors are those responses which in-
volve avoidance of the anxiety-arousing stimulus
and its consequents. Included here are repression,
denial, and many types of rationalization. At the
sensitizing extreme of the continuum are behaviors
which involve an attempt to reduce anxiety by
approaching or controlling the stimulus and its
consequents. The latter mechanisms include
intellectualization, obsessive-compulsive behaviors,
and ruminative worrying [p. 169].


Repression and sensitization have usually
been thought of as two different but equally
maladaptive defensive patterns, since they
both represent extreme and rigid modes of
defense. However, some recent research has
suggested that the repressive mode may be
seen with greater frequency than the sensitizing
mode in a relatively well-adjusted population,
whereas the sensitizing mode would be
more characteristic of a relatively maladjusted
population. Ullmann (1962) compared
the scores on his facilitation-inhibition
scale (which correlates -.94 with Byrne’s
repression-sensitization scale) of two groups
of male neuropsychiatric Veterans Administration
patients (N = 90 and 64) and 47
male college students, and found that the
psychiatric patients fell further toward the
facilitating, or sensitizing, end of the continuum.
Thus psychiatric patients are more
prone to the admission of anxiety and ruminative
obsessive worrying than are college
students.

Byrne, Golightly, and Sheffield (1965) ad- 
ministered the repression-sensitization scale 
and the California Psychological Inventory 
(CPI) to 91 students and obtained significant 
correlations on 7 of the 18 CPI variables. 
Results showed that repressors fell at or 
above the mean of the CPI standardization 
group, while sensitizers fell below the mean 
on most variables. However, Byrne et al. 
feel that the conclusions concerning the 
relationship between defense mechanisms and 
adjustment should be drawn with caution, as, 
to date, the majority of evidence which points 
to this relationship is based on paper-and- 
pencil tests of adjustment. Byrne et al. 
(1965) state: 

Since repressors by definition deny that anything
is wrong with them and sensitizers by definition
overemphasize their failings, perhaps we should
not be surprised to find them repeating this pattern
of responses on other self-report instruments [p.
588].


Before a relationship between defensive style 
and adjustment can be said to be established, 
adjustment will have to be defined by means 
other than self-report paper-and-pencil de- 
vices. 

The repression-sensitization dimension has 
been criticized by Christie and Lindauer 
(1963) on the grounds that it does not seem 
to be measuring an independent construct, 
but rather merely the need to respond to items 
in the socially desirable direction or a tend- 
ency to “naysay” or “yeasay.” They argue 
that Altrocchi, Parsons, and Dickoff’s (1960) 
and Byrne’s (1961) MMPI-based scoring 
systems, both of which purport to measure 
the defensive style continuum of repression- 
sensitization, could as easily be interpreted in 
terms of a social desirability or acquiescence 
response set. Byrne (1964) responds to this 
criticism of the repression-sensitization di- 
mension by indicating that he expects that 
“individuals who utilize repressive mechanisms 
should stress socially desirable characteristics 
while sensitizing individuals should emphasize 
socially undesirable characteristics [p. 198].” 
Byrne also states that unpublished research 
at the University of Texas has shown a cor- 
relation of only —.37 (p < .01) between the 
Marlowe-Crowne Social Desirability Scale 
and the Byrne repression-sensitization scale 
in a sample of 115 students. Based on this 
study, the repression-sensitization scale does 
seem to be tapping a dimension which can- 
not be fully accounted for by reference to the 
social desirability variable. 

No studies have yet investigated the rela- 
tionship between repression-sensitization and 
acquiescence response set, or the tendency to 
agree (yeasay) or disagree (naysay) with 
items regardless of their content. However, 
on the basis of Couch and Keniston’s (1960) 
clinical description of yeasayers as individuals 
who are prone to agree with, act out, and in 
other ways yield to the pressure of stimuli 
exerted upon them and naysayers as individ- 
uals who tend to employ sublimation and 
many forms of denial in response to the pres- 
sures of internal or external stimuli, it would 
seem likely that a relationship does exist 


CAROL Z. FEDER 


between these two variables. Theoretically 
these two variables seem related, but to 
what extent is as yet undetermined. 

The purpose of the present study is to in-
vestigate the relationship between the repres-
sion-sensitization dimension and adjustment
status, acquiescence response set, and social 
desirability. 

METHOD 
Subjects 


The Ss were 161 hospitalized male patients at the 
Veterans Administration Hospital, Manhattan, New 
York. Of this number, 83 were hospitalized on 
medical or surgical wards and 78 were hospitalized 
on one of the psychiatric wards. Only those medical 
and surgical patients were used who had no known 
previous history of psychiatric hospitalization, no 
terminal illness, no illness which was deemed by 
the attending physician to have a major psy- 
chiatric component, and no signs of organicity. All 
medical and surgical patients were also ambulatory
to the extent that they could leave their beds and 
wards for a few hours, even though they might be 
in wheelchairs. Diagnoses of medical-surgical Ss are 
presented in Table 1. 

Only those psychiatric patients were used who 
had no signs of organicity, were judged to be not 
actively hallucinating, and had not undergone 
electric shock treatment in the 6 weeks prior to 
being tested. Diagnoses of psychiatric Ss are pre- 
sented in Table 2. 


TABLE 1
DIAGNOSES OF MEDICAL-SURGICAL SUBJECTS

Diagnosis                          Diagnosis
Hernia                             Biopsy of axillary nodes
Hemorrhoids                        Trigger thumb
Diabetes                           Skin graft
Dental caries                      Pyelonephritis
Rectal abscess                     Exophthalmos
Peripheral vascular disease        Essential hypotension
Varicose veins                     Fractured jaw
Diabetic amputations               Low back pain
Pilonidal cyst                     Ligation of testicle vein
Peptic ulcer                       Lipoma of the neck
Gall bladder                       Lesion of chest wall
Appendicitis                       Cell lesion of the nose
Anemia                             Fractured ankle
Rheumatic heart disease            Fractured knee cap
Gastritis                          Fractured toe
Polyp of colon                     Sigmoid tumor
Diverticulosis                     Cellulitis of leg
Ulcer of leg
Cerebral vascular insufficiency
Hepatitis

Note. Total n = 83; the n for each diagnosis is illegible.


TABLE 2
DIAGNOSES OF PSYCHIATRIC SUBJECTS

Diagnosis                                         n
Schizophrenic reaction, mixed or paranoid type   31
Depressive reaction                              26
Anxiety reaction                                 11
Alcoholism                                        4
Passive-aggressive personality disorder           3
Obsessive-compulsive neurosis                     3
Total                                            78


Adjustment-Maladjustment 


For the purposes of this study, the patients who 
resided on the medical or surgical wards were 
classified as “adjusted” and the patients who resided 
on the psychiatric wards were classified as “mal- 
adjusted.” After the elimination of surgical or med- 
ical patients with possible psychiatric involvement 
as potential Ss, it was felt that mere presence on 
the two different types of wards—medical-surgical 
versus psychiatric—provided prima facie evidence 
of the patients’ adjustment status. 

The Cornell Index (Weider, Mittelmann, Wechsler, 
Wolff, & Meixner, 1944) was also included as a 
measure of adjustment status. The Cornell Index,
Form N2, was developed and used by the Army 
to help screen out individuals with psychiatric 
disturbance and consists of 101 questions pertaining 
to possible psychological and psychosomatic prob-
lems.


Repression-Sensitization 


The MMPI was used to categorize repressors and 
sensitizers, with Byrne’s (1961) revision of the 
Altrocchi et al. (1960) scoring system employed. In
order to obtain a cutoff point for repressors and
sensitizers, the repression-sensitization scores of all
161 psychiatric and medical patients were placed
in a frequency distribution. Cutoff points were
established which classified 40% of the Ss as 
repressors and 40% as sensitizers. 
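
The cutoff procedure can be sketched as follows. The example is only an illustration: it assumes that low repression-sensitization scores mark repressors and high scores mark sensitizers, it ignores the handling of tied scores at the cut points (which the report does not describe), and the function name and scores are hypothetical.

```python
def classify_by_tails(scores, tail_fraction=0.40):
    """Label the lowest and highest tail_fraction of scores; leave the rest unclassified."""
    ranked = sorted(scores)
    tail_size = int(len(ranked) * tail_fraction)  # number of Ss in each tail
    low_cut = ranked[tail_size - 1]   # highest score still counted as a repressor
    high_cut = ranked[-tail_size]     # lowest score still counted as a sensitizer
    labels = []
    for score in scores:
        if score <= low_cut:
            labels.append("repressor")
        elif score >= high_cut:
            labels.append("sensitizer")
        else:
            labels.append("unclassified")
    return labels, low_cut, high_cut


# Hypothetical scores, for illustration only.
example_labels, example_low_cut, example_high_cut = classify_by_tails(list(range(1, 11)))

# With 161 Ss, a 40% tail contains 64 Ss.
tail_size_for_161 = int(161 * 0.40)
```

A tail of 64 Ss is consistent with the column totals of 64 repressors and 64 sensitizers in Table 3.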


Social Desirability 


The Marlowe-Crowne Social Desirability Scale
(Crowne & Marlowe, 1960) consists of 33 items 
which were drawn from a universe of behaviors 
which are culturally sanctioned and approved but 
improbable of occurrence. The behaviors reflected in 
the items have only minimal implications for path- 
ology or abnormality whether responded to in the 
socially desirable or undesirable direction. Of the
items, 18 are keyed “true” and 15 are keyed “false,” 
largely eliminating a response-set interpretation of 
scores. This scale was used to define the Ss’ need
to respond in a socially desirable manner. 




Acquiescence Response Set 


The S’s tendency to agree (yeasay) or disagree 
(naysay) with items in a personality inventory, re- 
gardless of content, was assessed by his score on the 
15-item Couch and Keniston Acquiescence Response 
Set Scale (Couch & Keniston, 1960). 


Procedure 


All medical patients who met the initial criteria 
for inclusion were individually asked if they would 
volunteer to participate in a study being conducted 
by the psychology department. Psychiatric patients 
who met the initial criteria were told that the 
testing session was just part of the routine psy-
chological tests which are given periodically by the
psychology department and were given an appoint-
ment to be tested.

Medical and psychiatric patients were tested 
separately. The patients were tested in groups, 
usually comprising not less than three nor more than 
six patients. 

All of the patients took the following tests, some 
of which were given for the purposes of another 
research project: (a) a questionnaire designed to 
elicit information regarding social competence; (b) 
the Shipley Institute of Living Scale for Measuring 
Intellectual Impairment; (c) the Cornell Index; 
(d) a Q sort of 44 trait-descriptive adjectives to 
be sorted first for real self and then for ideal self; 
(e) the 15-item Couch and Keniston Acquiescence 
Response Set Scale, which was labeled Personal 
Reaction Inventory 1; (f) the Marlowe-Crowne 
Social Desirability Scale, which was labeled Per- 
sonal Reaction Inventory 2; and (g) the MMPI. 

The questionnaire was always given first and the 
MMPI was always given last. The order of the 
other tests was randomized. Testing was usually split 
into two sessions, with the MMPI being given in 
the second session. 


RESULTS 


A chi-square test was performed to investi- 
gate the relationship between the repression- 
sensitization dimension and the adjustment- 
maladjustment dimension. As can be seen in 
Table 3, the medical subjects were predomi- 
nantly repressors and the psychiatric subjects 
were predominantly sensitizers (χ² = 28.24,
p < .01). It appears that the defensive styles
of repression and sensitization do not occur 
with equal frequency in both medical (“ad- 
justed”) and psychiatric (“maladjusted”) 
populations. 
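
The chi-square value can be reproduced from the observed frequencies reported in Table 3 (49 and 19 for the medical group, 15 and 45 for the psychiatric group). The sketch below computes expected frequencies from the marginal totals and applies the ordinary Pearson formula without a continuity correction, which is an assumption; the function name is illustrative.

```python
def chi_square_2x2(observed):
    """Pearson chi-square for a table of counts, with expected counts from the margins."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, count in enumerate(row):
            expected_count = row_totals[i] * col_totals[j] / grand_total
            expected_row.append(expected_count)
            chi_square += (count - expected_count) ** 2 / expected_count
        expected.append(expected_row)
    return chi_square, expected


# Rows: medical, psychiatric. Columns: repressor, sensitizer.
TABLE_3_OBSERVED = [[49, 19], [15, 45]]
chi_square, expected = chi_square_2x2(TABLE_3_OBSERVED)
```

The expected frequencies come out to 34 and 30, and the statistic rounds to the reported 28.24, which suggests that no continuity correction was applied.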

TABLE 3
RELATIONSHIP OF REPRESSION-SENSITIZATION
AND ADJUSTMENT STATUS

                                                     Repressor +
Status                     Repressor    Sensitizer   Sensitizer
Medical, adjusted          49 (34)a     19 (34)          68
Psychiatric, maladjusted   15 (30)      45 (30)          60
Total                      64           64              128

Note. χ² = 28.24, df = 1, p < .01.
a Expected frequencies appear in parentheses.


An analysis of variance was used to investigate
differences in the Cornell Index scores
as a function of adjustment status and defensive
style. Results show a significant
psychiatric-medical-status main effect (F =
22.50, p < .01), a significant repression-sensitization
main effect (F = 69.42, p < .01),
and a significant Repression-Sensitization ×
Psychiatric-Medical-Status interaction (F =
5.49, p < .05). Table 4 shows that the Cornell
Index scores of psychiatric sensitizers are
significantly higher (p < .01) than the scores
of any other group; the scores of medical
sensitizers are significantly higher than the
scores of medical repressors (p < .01), but
the scores of psychiatric repressors are not
significantly higher than the scores of medical
repressors.

TABLE 4
COMPARISON OF MEAN SCORES ON CORNELL INDEX

Subject       Psychiatric    Medical    t
Sensitizer    37.40          21.60      5.01*
Repressor     13.60           8.25      1.70
t             [illegible]     4.24*

*p < .01.


In order to test the hypotheses that there
would be relationships between the repression-sensitization
dimension, social desirability,
and acquiescence response set, three
Pearson r’s were computed.² Results show
that the correlation between repression-sensitization
scores and scores on the Marlowe-Crowne
Social Desirability Scale (n = 136)
was -.45 (p < .01). The correlation between
repression-sensitization scores and
scores on the Couch and Keniston Acquiescence
Response Set Scale (n = 138) was
+.37 (p < .01). A correlation was also computed
between scores on the Couch and
Keniston Acquiescence Response Set Scale
and scores on the Marlowe-Crowne Social
Desirability Scale (n = 135), and the resulting
correlation coefficient of -.08 was not
significant (p > .05).


2 All 161 subjects were administered the measures
of social desirability and acquiescence response set;
however, since some subjects failed to complete one
or both of these, the correlations are based on
slightly reduced Ns.


DISCUSSION


The results of this study confirm the find- 
ings obtained by Ullmann (1962) and Byrne 
et al. (1965), with respect to repression- 
sensitization and adjustment status. Some of 
the previous evidence concerning the relation- 
ship between defensive style and adjustment 
was inconclusive because both variables had 
been defined by self-report paper-and-pencil 
measures. In this study maladjustment was 
defined by the empirical criterion of the sub- 
ject’s presence on the psychiatric ward of the 
hospital and adjustment by the criteria of 
the subject’s presence on a medical or surgi- 
cal ward and lack of a previous history of 
psychiatric hospitalization or major psychi- 
atric overlay to his present illness. A clear 
relationship was found between type of de- 
fensive style and adjustment status. A sig- 
nificant relationship between defensive style 
and adjustment seems to be established. 

As with previous studies using paper-and- 
pencil tests, defensive style strongly affected 
scores on the Cornell Index. Though the 
Cornell Index significantly discriminated psy- 
chiatric from medical patients, sensitizers 
were also found to get significantly higher 
scores than repressors within both the psy- 
chiatric and medical groups, and the scores of 
psychiatric and medical repressors were not 
distinguishable. In view of the tendency of 
sensitizers to admit illness and the tendency 
of repressors to deny illness, this finding is
not surprising; nor is it surprising in view
of the finding that sensitizers are found to
be more maladjusted than repressors. How-
ever, this finding would seem to make it im-
perative that paper-and-pencil tests of “ad-
justment” be interpreted with caution. Unless
one takes into account the defensive mode of


RELATIONSHIP OF REPRESSION-SENSITIZATION


the individual who is responding to a test 
item, the test results will be distorted. An 
individual’s usual defensive mode of adapta- 
tion to threat along a repression-sensitization 
continuum should be thought of as a potential 
influence on any statement he may make 
about himself on the usual assortment of 
paper-and-pencil tests. 

The hypothesis that there is a relationship 
between the repression-sensitization dimen- 
sion and the need to respond to items in a 
socially desirable direction was confirmed. 
The correlation of -.45 (p < .01) which was
found in this study between the repression- 
sensitization dimension and the Marlowe- 
Crowne Social Desirability Scale confirms the 
findings of other studies. Byrne (1964) found 
a correlation of -.37 (p < .01) between
repression-sensitization and the Marlowe- 
Crowne scale in a group of 115 students. 
Silber and Grebstein (1964), using three dif- 
ferent samples, found correlations of —.32 (p 
< .02), —.48 (p < .01), and —.39 (p < .01) 
between Byrne’s repression-sensitization di- 
mension and the Marlowe-Crowne scale. It 
is obvious from these findings that a signifi- 
cant negative relationship exists between the 
repression-sensitization dimension and the
Marlowe-Crowne Social Desirability Scale,
with repressors indicating significantly more
need to respond in a socially desirable direc-
tion than sensitizers. Whereas Christie and
Lindauer (1963) indicate that the construct
of repression-sensitization and the construct
of social desirability may be really one and 
the same, and therefore interchangeable, the 
results of this study, as well as those of 
Byrne (1964) and Silber and Grebstein 
(1964), indicate otherwise. 

Results of this study also confirm the hy-
pothesis that a relationship would be found
between the repression-sensitization dimen-
sion and an acquiescence response set. A cor-
relation of +.37 (p < .01) was found be-
tween subjects’ scores on the repression-sensi-
tization dimension and scores on the Couch
and Keniston Acquiescence Response Set 
Scale, with sensitizers being more acquiescent 
than repressors. However, the size of the cor-
relation indicates that repression-sensitization
scores can no more be interpreted merely in
terms of an acquiescence response set than
they can be in terms of a social desirability
dimension. Christie and Lindauer’s (1963) 
contention again does not seem to be sup- 
ported. The repression-sensitization dimen- 
sion is tapping something other than just the 
tendency to “yeasay” or “naysay.” With the 
insignificant correlation between the Mar- 
lowe-Crowne Social Desirability Scale and 
the Couch and Keniston Acquiescence Re- 
sponse Set Scale, a correlation of each of 
these variables with the repression-sensitiza-
tion dimension of about .40, and the test-
retest reliability of the repression-sensitiza-
tion scale being .88, it appears that about
one-quarter of the predictable variance of the 
Tepression-sensitization scores is contributed 
by the social desirability variable and about 
one-quarter by the acquiescence variable. This 
leaves about one-half of the variance unac- 
counted for by response-style measures. Fu- 
ture research will have to determine the spe- 
cific nature of the variance unaccounted for. 
However, it does appear that the repression- 
sensitization scale is not merely an equivalent 
form of the social desirability or acquiescence 
response set scale, but rather is measuring a 
rather complex and currently insufficiently 
defined dimension.
USE OF THE MINIMAL SOCIAL BEHAVIOR SCALE IN A
NONPSYCHIATRIC POPULATION ¹


E. R. OETTING AND CHARLES W. COLE


Colorado State University 


This report is a study of the usefulness of the Minimal Social Behavior Scale
with adult males of low average intelligence and socioeconomic status.
37 Ss were tested. Of the 32 items, only 5 appeared to have sufficient variability
among Ss to be useful in evaluation of treatment outcomes.


The Minimal Social Behavior Scale (MSBS), 
devised by Farina, Arenberg, and Guskin (1957), 
consists of 32 items which may be scored during 
an individual interview to yield an index of the 
S’s ability to respond appropriately to social 
stimuli. Ulmer and Timmons (1966) present evi- 
dence for the validity of the MSBS when used 
with institutionalized chronic schizophrenic pa- 
tients. Since significant mean increases were ob- 
served for patients in certain groups, the authors 
concluded that the MSBS may provide an objec- 
tive and reliable measure of behavior and behav- 
ioral changes of Ss in social interaction situa- 
tions and that the instrument may be of use in 
evaluating the effects of a variety of treatment 
procedures, such as psychotherapy, special education,
rehabilitation, vocational training, etc. They
indicated, however, that the present form of the 
MSBS may be less effective with Ss of average 
or higher intelligence levels. They suggest that 
the addition of a time score to four items may 
extend the sensitivity and applicability of the 
instrument. (This scoring is described by Ulmer 
and Timmons in an extended report.) 


METHOD 


The MSBS was administered to 37 nonhospitalized 
adult males who were currently in training under the 
Educational Opportunity Act. Age ranged from 23 
to 58, with a mean of 34.6. California Elementary 
Level Achievement Test scores ranged from a grade
level of 2.0 to 7.6 with a mean of 5.8. The average
grade completed was 7.0, ranging from no formal
schooling to completion of the tenth grade. Mean
estimated Verbal IQ was 85.² Each S was assigned
on a random basis to either of two examiners, both
of whom were experienced in individual testing.


¹ This investigation was supported, in part, by a
United States Public Health Service Research Grant
3 11-Mh-00458 (04) from the National Institutes of
Health.

² These data were supplied through the cooperation
of the Laramie County Welfare Agency, Cheyenne,
Wyoming.


RESULTS AND DISCUSSION 


While the time score did increase the variabil- 
ity of the timed items, the findings indicate that 
the MSBS will require considerable revision if 
it is to be used with Ss of even low ability. Of 
the 32 items, only 10 showed any variability. Of 
the items, 21 were scored plus for all 37 Ss; 1 
item was missed by every S. This latter item is 
one in which E feigns a headache by rubbing his
head, etc. The item is scored plus if the S makes
a verbal response which refers in some way to 
“head” or “pain.” Another five items were missed 
by only one or two Ss. The remaining five items 
(9, 14, 18, 19, and 23) accounted for nearly all 
the variability in total score.
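
The point about variability can be illustrated with the variance of a dichotomous (plus or minus) item, p(1 − p), where p is the proportion of Ss scored plus. The following is only a sketch: the function name is hypothetical, and the pass counts are illustrative values read off the description above (37 Ss, so an item missed by one S is passed by 36).

```python
def item_variance(n_plus: int, n_subjects: int) -> float:
    """Variance of a 0/1 item: p * (1 - p), with p the proportion scored plus."""
    p = n_plus / n_subjects
    return p * (1.0 - p)


N_SUBJECTS = 37  # sample size reported in the study

# Illustrative pass counts taken from the description in the text.
examples = {
    "passed by all 37 Ss": 37,
    "missed by every S": 0,
    "missed by one S": 36,
    "missed by two Ss": 35,
}

for label, n_plus in examples.items():
    print(f"{label}: variance = {item_variance(n_plus, N_SUBJECTS):.4f}")
```

An item passed or failed by every S has zero variance and so cannot contribute to differences in total score, and an item missed by only one or two Ss contributes very little.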

These five items have in common the factor
of socially accurate interpretation of the stimulus
presented by E, and all but one require a spontaneous
response on the part of S. In general,
they have high face validity as measures of social
responsivity. Item 9, for example, is scored
plus if, when E “accidentally” pushes a pencil off
the desk, the S picks up the pencil without being
asked. Item 14 is somewhat different in that the
S is required to complete a simple task. The E
offers a pencil to the S and says, “I would like
you to copy this drawing on this paper.” Item
14 is scored plus if the subject draws any four-sided
figure with diagonals. (All Ss in the present
sample accepted the pencil and made at least some
mark on the paper, thus earning credit on both
Items 12 and 13.) Items 18 and 19 involve two
types of subject behavior to the same stimulus
situation. For these items, E crumples a sheet of
paper, tosses it at the waste basket, purposely
missing, and says, “Darn it, missed again!” The
S receives a plus on Item 18 if he smiles or laughs
in response to E’s exclamation. If he spontaneously
picks up the paper and deposits it in the
waste basket, he is given a plus for Item 19. In
Item 23, a book of matches has been placed on
the desk within reach of S. The E places a cigarette
between his lips, fumbles for matches, stands
up and pats his pockets. If the S offers E a light,
calls attention to the matches, or hands them to
E, he receives a plus.

It may be concluded that the present form of 
the MSBS is not difficult enough for use with 
other than a severely disturbed or mentally de- 
ficient population. The 5 items of the 32 which 
did show variability are not highly intercorre- 
lated and may provide some predictive variance 
or may be useful in the evaluation of treatment 
effects; however, when used with normal or near-normal
Ss, the majority of the test items appear
to be inadequate. 
MALE-FEMALE DIFFERENCES IN MMPI EGO STRENGTH:
AN ARTIFACT


DAVID S. HOLMES 


University of Texas 


This research tested the hypothesis that the lower Es scores of women were not 
a function of a general tendency for women to admit to more pathology than 
men but rather were due to a number of items on the scale which seemed to 
measure sex role identification. An item analysis indicated that the Es differ- 
ences were a function of a number of specific items which in most cases were 
related to sex role. Es scores significantly predicted speed of response to 
psychotherapy. When the tests were rescored without the items to which men 
and women had responded differentially, the male-female differences in Es were 
canceled out, but the predictive validity of Es in regard to psychotherapy was
not affected.


A consistent finding in the research on the 
Ego Strength (Es) scale of the MMPI (Barron, 
1953) is that men achieve higher scores than 
do women (Distler, May, & Tuma, 1964; 
Getter & Sundland, 1962; Hathaway & Briggs, 
1957; Korchin & Heath, 1961). Distler, May, 
and Tuma (1964) explained this finding on the 
basis of perceived sex roles: “For patients from 
a predominantly lower-middle, and lower socio- 
economic background, the perceived female role 
permits a disintegrated admission of symptoms 
and bid for dependency and care, whereas the 
perceived male role demands the maintenance of 
a facade of strength and control even in the 
face of the obviously upsetting circumstances 
of hospitalization [p. 175].” It could also be 
suggested that, independent of the symptom- 
admission threshold, women might generally have 
less ego strength. 


Inspection of the items which make up the 
Es scale indicated that most items were measures
of psychopathology and were scored for
Es when the pathology was denied. Some items,
however, seemed primarily to reflect sex role
identification or characteristics. For example,
some items on the Es scale were also scored on
the Masculinity-Femininity (Mf) scale and were
not scored on any of the original clinical scales.
It may have been then that sex, relatively independent
of pathology, was responsible for the
differences in Es scores between men and women.
In the development and original validation of
the scale (Barron, 1953), the populations were
not broken down by sex, and the item overlap
with the Mf scale was not reported. Therefore,
a possibly differential response due to sex could
not have been determined.

If the feminine role permits greater admission
of symptoms, or if women actually have
less ego strength, and if it is either of these
factors which results in the Es score difference
between men and women, we would expect a 
general elevation of symptom admission over all 
items on the Es scale by women. On the other 
hand, if the scale score differences were a result 
of the inclusion of items measuring sex role 
identification, we would predict that the difference
in Es scores would be a function of a male-female
differential response only on specific
items, and that these would be items which are 
relevant to sex role identification and character- 
istics. If this proved to be the case, it might 
be expected that prediction of response to psy- 
chotherapy, the purpose for which this scale was 
designed, would be improved by the omission 
of the sex-role-related items. 


METHOD 


Subjects were 38 successive admissions, all literate, 
to the Massachusetts Mental Health Center, a 200- 
bed state hospital located in Boston, Massachusetts. 
There were 17 women and 21 men. The MMPI in 
booklet form was administered to all subjects within 
the first 6 days of hospitalization. Their responses 
were scored for the three validity scales, the nine 
original clinical scales, and the Es scale. Raw scores 
were employed in the analyses.¹

Each patient was judged on his or her speed of 
therapeutic response by the treating psychiatrist. 
This psychiatrist classified each patient he was treat- 
ing as (a) a slow responder, (b) an average re- 
sponder, or (c) a fast responder. The classification 
was done without knowledge of the MMPI scores. 
Treatment consisted of individual psychotherapy 
three times per week, in addition to the various 
therapeutic adjuncts available in the hospital. 


RESULTS AND DISCUSSION


When men and women were compared over 
all scales, men were found to have significantly 
higher Mf and Es scores (t= 4.73, 2.79, respec- 
tively, df = 36). Women had elevated Hs and 
Hy scores (t = 3.31, 3.70, respectively, df = 36). 
More importantly, however, with regard to the
level of admitted pathology, the men and women
did not differ on the K scale (t = .22, df = 36),
a scale which is usually considered to be a measure
of defensiveness or general pathology. Since
it was therefore doubtful that women admitted
to more pathology in general, a tendency which
would have lowered the Es scores of women as
a result of greater symptom admission over all
items on the Es scale, an item analysis was
performed to determine whether specific items
might account for the lower Es scores of women.


¹ The author is indebted to Luther Distler and
Phillip May for pointing out that raw scores
rather than profile T scores should be used, since
the T scores are adjusted to decrease the level of
symptom admission of women.

Comparison of the response frequency of 
men versus women over each of the 68 items 
which make up the Es scale revealed that on 
8 items men responded significantly (p < .02) in 
the scored direction more frequently than did 
women. This finding supported the hypothesis 
that the difference in Es scores was not a 
function of a general tendency for women to 
admit to symptoms (i.e., respond in the non- 
scored direction) over all items, but was rather 
a function of specific items. In determining the
proportion of overlap between the Es and Mf
scales for these items, only the first 366 items 
of the total test could be considered, since the 
last 200 items are not scored on any of the 
original scales and membership on the Mf scale 
would be impossible. Five of the items to 
which men and women responded differentially 
appeared in the set of items which overlapped 
with the items on the original clinical scales 
(Table 1, Part A), and three of these items 
were also scored on the Mf scale. In each case, 
as was predicted, the overlapping items were 
scored so that a response which increased the 
Es score also indicated increased “masculinity” 
on the Mf scale. For example, if a subject indi- 
cated that he did not like to cook (No. 140) 
or would not like to paint flowers (No. 261),
he received a higher Es score. One other item
(No. 132), “I like collecting flowers or growing 
house plants.” [F], on which the differential 
response approached significance (χ² = 3.47,
df = 1) also overlapped with the Mf scale and 
was scored in the same direction for masculinity. 
These results clearly suggested that for this set 
of items, sex role identification independent of 
pathology played a large role in determining 
the Es scores. 

Since Items 488, 510, and 548 were not scored 
on the original scales, it was difficult to evaluate 
them in terms of their “pull” on the basis of 
sex. However, Drake (1953), using a student 
population, found a significant differential re- 
sponse on the basis of sex to Item 488 (“I pray 
several times a week.”), and census data (Gold- 
field, 1964) indicates that women are more active 
in religious activities than men. Subjectively, the 
remaining two items, especially Item 548, 
seemed to have a sex component, but there 
was no independent empirical support for this 
contention. 

It was concluded from the above findings
that the higher Es scores of men, as opposed to
women, were not a function of a general
tendency for women to admit more symptoms,
but due rather to a number of specific test
items which were measures of a masculine role
identification. Rescoring the tests without these
items yielded male and female scores which did
not differ significantly (t = 1.76, df = 36). It
should be noted that these results do not necessarily
refute the hypothesis that women tend
to admit to more pathology. Rather, these results
suggest that it is not this tendency which causes
the difference in Es scores between men and
women.


TABLE 1

ITEMS TO WHICH MEN RESPONDED MORE
FREQUENTLY IN SCORED DIRECTION
THAN DID WOMEN

Item
No.    Item                                                     χ² (a)

A. Scored on original scales
140.   I like to cook. (F)                                        7.90
153.   During the past few years I have been
       well most of the time. (T)                                 8.41
174.   I have never had a fainting spell. (T)                     6.67
187.   My hands have not become clumsy or
       awkward. (T)                                              11.79
261.   If I were an artist, I would like to draw
       flowers. (F)                                               6.42

B. Not scored on original scales
488.   I pray several times a week. (F)                           8.23
510.   Dirt frightens or disgusts me. (F)                         8.44
548.   I never attend a sexy show if I can
       avoid it. (F)                                              6.53

Note. Abbreviated: F = false; T = true.
(a) In all cases, df = 1.

The correlation between the Es scores and the 
ratings of speed of response to therapy for all 
patients was .34 (df = 36, p < .025), which indicated
that high Es scores were significantly related
to a more rapid response to therapy. (The
correlations for the male and female groups were
.34 and .26, respectively.) When the tests were
rescored, and the eight items to which men and 
women had responded differentially were omit- 
ted, the correlation between Es and speed of 
response to therapy was .32 (df = 36, p < .025). 
(The correlations for the male and female groups 
were .32 and .25, respectively.) The correlations 
found when using the original and shortened 
Es scales did not differ significantly (t = .84,
df = 35). These results suggest that while the
elimination of the sex-related items did not improve
the predictive ability of the scale, their
absence did not appreciably diminish the predic- 
tive power. The drop in the correlation of .02 
seemed to be a small price to pay for the elimi- 
nation of the sex-related items, items which lead 
to Es score differences between men and women 
which in turn could lead to possibly erroneous 
conclusions. 

The fact that the eight items identified in the 
present study did empirically differentiate be- 
tween the sexes offers strong support for the 
contention that these items had a large sex com- 
ponent. Considering the number of comparisons 
made and the level of statistical significance 
used, only one of these differences would be 
expected on the basis of chance. The fact that 
all of the items identified by Drake (1953) as 
being sex related were not also found to dis- 
criminate between the sexes in the present study 
is probably due to differences in populations 
sampled (college students versus state mental 
hospital patients). It was not the purpose of the 
present paper to offer a “revised Es scale” for, 
needless to say, further refinements and cross- 
validations would be necessary before arriving 
at that point. Rather, the aim of this paper 
was to suggest a new interpretation of the male-female
differences on the Es scale which might
be taken into consideration when using the test 
in research or practice. 
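
The statement that only about one difference would be expected by chance follows from the 68 item comparisons and the .02 significance level. The sketch below reproduces that arithmetic and, as an added illustration, the binomial probability of eight or more chance differences. It assumes that the item comparisons are independent, which is not strictly true for items answered by the same subjects; the function and variable names are hypothetical.

```python
from math import comb

N_ITEMS = 68   # items on the Es scale, as reported
ALPHA = 0.02   # per-item significance level used in the item analysis
OBSERVED = 8   # items on which men and women differed significantly


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Expected number of "significant" items if no true differences existed.
expected_by_chance = N_ITEMS * ALPHA

# Probability of at least OBSERVED such items under the independence assumption.
tail = binomial_tail(OBSERVED, N_ITEMS, ALPHA)

print(f"expected chance differences: {expected_by_chance:.2f}")
print(f"P(at least {OBSERVED} by chance): {tail:.6f}")
```

Under these assumptions the expected count is a little over one item, so eight differences are well beyond what chance alone would be likely to produce.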
ROLE OF MATERNAL PSYCHOPATHOLOGY IN MALE AND
FEMALE SCHIZOPHRENICS ¹


G. GAIL GARDNER ²
Teachers College, Columbia University 


Selecting from records of a child guidance clinic, the psychopathology of 165 
mothers of male and female children and adolescents was studied in order to 
investigate sex differences in the degree of relationship between mother’s men- 
tal status and the child’s later hospitalization for schizophrenia. Of the chil- 
dren, 108 were later hospitalized for schizophrenia, while 57 achieved an ade- 
quate adjustment in areas of work and interpersonal relationships. For the 
girls, those who became schizophrenic had severely disturbed mothers signifi- 
cantly more often than did those who achieved an adequate adjustment (p< 
.02). The relationship between severity of maternal psychopathology and the 
child’s later mental status was not significant for boys. The results were consistent
with the findings of studies on sex role identification.


Many studies link the development of schizo- 
phrenia to individuals’ early experiences, with 
particular emphasis on the mother-child relation- 
ship and other aspects of maternal behavior, yet 
the evidence has never been conclusive. It is thus 
necessary to look for additional relevant varia- 
bles—either in the environment or in the pre- 
schizophrenic individual—which might influence 
the effect of various maternal behaviors on the 
child. 

Researchers have usually overlooked the possi- 
ble importance of the sex of the child. Most 
studies focus either on combined groups of males
and females or on one sex with the assumption
that the findings are relevant to both sexes.

A few studies, in which data were analyzed
separately for males and females, found differences
with respect to the types of maternal behavior
which were related to later development
of schizophrenia. Fleck, Lidz, and Cornelison
(1963) described the mothers of schizophrenic
females as aloof and unable to invest emotion in
the relationship with their daughters. The mothers
of schizophrenic males were described as dependent
upon their sons for emotional satisfaction,
failing to set ego boundaries between themselves
and their sons, engulfing, and seductive.


¹ This paper is based on a dissertation submitted
to the Department of Psychology, Teachers College, 
Columbia University. The author is grateful to the 
dissertation committee: R. A. Schonbar, W. N. 
Thetford, and particularly D. F. Ricks. This research 
was part of the Schizophrenia Research Project of 
the Judge Baker Guidance Center, Boston, Massa- 
chusetts, and it was supported in part by research 
Grant MH 10,466 from the National Institute of 
Mental Health, United States Public Health Service. 

² Now at Cornell University Medical College, 525
East 68th Street, New York, New York 10021. 


Cheek (1964) found that the mothers of schizo- 
phrenic males rated high on measures of tension 
and hostility, while the mothers of schizophrenic 
females rated low on these variables in comparison
with mothers of male and female controls.
Her finding for females is probably a variant of 
the aloofness noted by the Fleck group. Sobel 
(1961) focused on quantitative rather than 
qualitative variables in his study of a small group 
of children raised by psychotic mothers. He con- 
cluded that girls were more vulnerable than boys 
to the effects of serious psychopathology in the 
mother. However, none of the children in his 
study was described as schizophrenic. 

The studies by Cheek and the Fleck group both 
suffer from the fact that the observations were 
made after the time of hospitalization for schizo- 
phrenia. The investigators were thus in the un- 
fortunate position of having to assume that their 
observations actually reflected what took place 
in the patients’ early histories. In the present 
study, it was possible to gather accounts written 
during the patients’ childhood, prior to hospitali- 
zation for schizophrenia. Since the accounts of 
male and female children who became schizo- 
phrenic and those who became normal were 
studied without knowledge of the ultimate diag- 
nosis, this research was an improvement over 
previous investigations in terms of rater objec- 
tivity as well as the relevance of the behavior 
ratings. 

METHOD 
Subjects 

The Ss were 165 children who were seen at the 
Judge Baker Guidance Center, Boston, Massachusetts.
These children were selected from the files of
approximately 18,000 children seen at the center since 
its inception in 1917. By cross-checking these 18,000 
files with those of the Massachusetts Department of
Mental Health, it was possible to identify children 
who were later hospitalized with a diagnosis of 
schizophrenia. It was also possible to identify a 
group of control children who had been seen at the 
center but who had never been in a mental hospital 
or prison. Follow-up in the summer of 1965 indi- 
cated that the controls were making adequate ad- 
justments in areas of work and interpersonal rela- 
tionships. 

In the preschizophrenic group there were 60 males, 
of whom 32 became chronic schizophrenics and 28 
were eventually released from the hospital and able 
to function in the community. There were 48 fe- 
males, with 25 chronic and 23 released. The con- 
trol group contained 28 males and 29 females. 

The preschizophrenic groups were first hospitalized 
predominantly in their early 20's. For both males 
and females, the time spent in a mental hospital was 
significantly greater for the chronic group than for 
the released group. 

Within each of the three diagnostic groups 
(chronic, released, control), the males and females 
were matched on several demographic variables 
scored from the Judge Baker records, including so- 
cial class, IQ, and ethnic background. There were no 
sex differences for any diagnostic group in the era 
when the children were seen at Judge Baker (1917- 
1958) or in the age at which they were referred. 
Both males and females in all diagnostic groups 
were seen primarily between age 13 and 17 years. 
The total age range was 6-22. The age of the Ss in 
1965 was predominantly in the range of 35-50 years. 

Though the children were usually not given a 
psychiatric diagnosis when they were seen at the 
center, their presenting symptomatology was noted 
in detail. Both children who became schizophrenic 
and those who became controls were referred to 
the center for a wide variety of symptoms, includ- 
ing learning disabilities, inability to make friends, 
aggressive behavior at home and in the community, 
neurotic traits, and psychotic manifestations. The
presence of psychotic symptoms, including autistic 
thoughts and behavior, hallucinations, delusions, dis- 
orientation in time and space, and bizarre speech 
patterns, was strongly associated with later hos- 
pitalization for schizophrenia for both males and 
females, though these symptoms characterized only 
29% of the preschizophrenic group. Certain neurotic 
traits were associated with later schizophrenia in 
males but not in females, and this finding is dis- 
cussed in detail in another paper currently in prepa- 
ration. 


Procedure 


For the purpose of this study, the mother was 
defined as the female adult with whom the child 
had spent the greatest amount of time; in most 
cases this was the biological mother. There was no 
effort to restrict the sample to children whose fami- 
lies were intact, as it was felt that some degree of 
family disruption might often be related to psychological
disturbance in the child and to later schizophrenia.
If children from disrupted families had
been dropped from the sample, there would have
been a selective bias factor of unknown magnitude.


NOTES AND COMMENTS


TABLE 1

NUMBER OF MOTHERS OF CHRONIC, RELEASED, AND
SOCIALLY ADEQUATE MALES WITH SEVERE, MODERATE,
MILD, OR NO PSYCHOPATHOLOGY

                    Degree of maternal psychopathology

Diagnostic                          Mild or    Insufficient
group        Severe    Moderate     normal     information     Total

Chronic         8          9          11            4            32
Released        4          7          13            4            28
Control         2          3          19            4            28
Total          14         19          43           12            88

On the basis of all data available in the guidance 
center record, each mother was scored as being pre- 
dominantly in one of the following categories: (a) 
severe pathology—overtly psychotic, schizoid, bor- 
derline character disorder; (b) moderate pathology— 
depressed, dependent, impulsive; and (c) mild pa- 
thology or normal—compulsive, normal, or mild 
neurosis. Detailed criteria for the specific categories 
have been described in previous research by Waring 
and Ricks (1965) and by Nameche, Waring, and 
Ricks (1964). 


RESULTS 


The number of mothers characterized by each
of the three degrees of psychopathology in each
of the three diagnostic groups is presented in
Tables 1 and 2 for males and females, respectively.
The relationship between degree of maternal
pathology and the children’s later mental
status was measured by the chi-square technique
(df = 4, two-tailed test), and the magnitudes
were 8.84 for males (ns) and 12.94 for females
(p < .02).


TABLE 2

NUMBER OF MOTHERS OF CHRONIC, RELEASED, AND
SOCIALLY ADEQUATE FEMALES WITH SEVERE,
MODERATE, MILD, OR NO
PSYCHOPATHOLOGY

                    Degree of maternal psychopathology

Diagnostic                          Mild or    Insufficient
group        Severe    Moderate     normal     information     Total

Chronic        11          7           4            3            25
Released        5          6           6            6            23
Control         1         11          12            5            29
Total          17         24          22           14            77
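
The chi-square values can be checked against the cell frequencies in Tables 1 and 2. The sketch below assumes that the test was a Pearson chi-square on the 3 × 3 table of diagnostic group by degree of pathology, with the "insufficient information" column left out; this reading is consistent with the reported df = 4 but is not stated explicitly in the text. The function name is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: chronic, released, control. Columns: severe, moderate, mild or normal.
# The "insufficient information" column is omitted (an assumption, see above).
males = [[8, 9, 11], [4, 7, 13], [2, 3, 19]]
females = [[11, 7, 4], [5, 6, 6], [1, 11, 12]]

for label, table in (("males", males), ("females", females)):
    stat, df = chi_square(table)
    print(f"{label}: chi-square = {stat:.2f}, df = {df}")
```

Computed this way, the statistics come out close to the reported 8.84 and 12.94, which supports the reading that cases with insufficient information were excluded from the test.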


DISCUSSION


The results indicate that schizophrenia in 
women is related to serious psychopathology in 
the patients’ mothers. There appeared to be some 
relationship for males, but it was less pronounced. 
These results are consistent with the findings of 
Sobel (1961) and Rosenthal (1962) who noted 
that schizophrenia occurs more often among same- 
sex relatives than among opposite-sex relatives. 

The finding that disturbance in the mother is 
likely to have more serious consequences for a 
female child than for a male child may be ex- 
plained by theoretical models focusing on genetic 
transmission, environmental impact, or some com- 
bination of both. In terms of environmental ex- 
planations, it may be important that girls usually 
identify with their mothers throughout childhood, 
while boys shift away from their mothers to 
model themselves after their fathers or, at least, 
after a cultural stereotype of masculinity (Cameron,
1963).

The data in the present study suggest that a 
boy might develop serious psychopathology if he 
is unable to shift away from the original identi- 
fication with the mother. For example, one boy 
who later became a chronic schizophrenic had an 
ineffective father and a mother who perceived 
the child as an extension of her own ego. She 
dressed him in girls’ clothes until he was 5 years 
old, and when he was 15 he still sat on the toilet 
to urinate. The boy’s fears and fantasies revealed 
a strong pull toward a feminine identification.

As the above example indicates, the strength of 
a child’s identification with a parental figure is 
partly a function of the degree to which the 
parent identifies with the child. In the present
study, for both schizophrenic and control groups,
the mothers identified themselves significantly
more often with the girls than with the boys.³
Thus while the mother’s identification with her
daughter (as opposed to seeing the child as an
individual in her own right or identifying the
child with someone else) does not in itself seem
related to later pathology in the child, there may
be pathological consequences if the mother who
fosters the child’s identification with her shows
evidence of serious emotional disturbance.

In one case, a psychotic mother identified
strongly with her 9-year-old daughter. She had
sat in school with the girl every day in first
grade and still slept with her frequently. The
child expressed her desire to escape from intrusion
by stating, “I would like to hide somewhere
where no one could ever find me.” Ironically,
her wish was granted, for she has been in a mental
hospital the greater part of the last 20 years,
and it is no longer possible to establish meaningful
contact with her.


³ The mother was considered to identify herself
with her child when she made explicit statements
concerning similarity between herself and her child
with respect to personality traits and symptom patterns,
or more general statements that she and the
child “are just alike.”
ATTITUDES INFERRED FROM NEUTRAL VERBAL COMMUNICATIONS ¹


ALBERT MEHRABIAN 
University of California, Los Angeles 


The dimension of immediacy in verbal communication quantifies degrees of the 
intensity and directness of interaction between a communicator and the object 
of his communication. A previous study indicates that there is more immediacy 
in communications about liked others than in communications about disliked 
others. In the present study Ss, who were not familiar with the explicit 
immediacy categories and associated inferences about attitudes, were presented 
with pairs of communications differing in degree of immediacy, but neutral in 
explicit communication of attitudes. It was predicted that in the case of 
ostensibly neutral communications the more immediate communication in each 
pair would be judged as indicating a more positive quality of communicator 
attitude. The prediction was confirmed in 6 of the 7 immediacy categories
investigated.


A common assumption about psychologically 
maladjusted people is that they cannot “explic- 
itly” communicate their positive or negative at- 
titudes towards others (e.g., Bateson, Jackson, 
Haley, & Weakland, 1956; Deutsch & Murphy, 
1955; or Rogers, 1959). Explicit communications 
of attitudes can be defined as those which con- 
tain references to positive or negative affect, 
evaluation, or preference in the denotative mean- 
ing of the words used in the communications. 
Such communications are exemplified by “I enjoy 
seeing X,” or “I don’t want to see X,” in con- 
trast to “I don’t know X too well.” When atti- 
tudes are not explicitly communicated, it is as- 
sumed that they are communicated implicitly in 
some other way (e.g., tone of voice, facial ex- 
pression, gesture, speech rate, or postural orien- 
tation). While a number of investigations have 
shown that there is above-chance accuracy in the 
inference of attitudes or feelings from implicit 
communications, it has been difficult to specify 
how such inferences are made (e.g., studies re- 
viewed by Davitz, 1964; or studies reported in 
Knapp, 1963). In other words, the particular 
patterns of cues that are consistently related to 
different kinds of communicator attitude or 
feeling are not known. This difficulty is partially 
due to the lack of a reliable system for classify- 
ing and recording implicit communications. The 
classification systems which have been proposed 
(e.g., Birdwhistell’s, 1952, categories for the de- 
scription of movement or Pittenger & Smith’s, 
1957, summary of categories for the description 
of paralinguistic data) have not been successfully 
related to communicator attitudes or feelings 
(e.g., Dittmann & Wynne, 1961). 


1 This study was supported by University of Cali- 
fornia Grant 2189. 


Implicit verbal communications of attitude or 
feeling (e.g., speech rate or errors) are easier to 
study, since speech can be recorded without any 
special terminology or training. Thus, when ex- 
plicit information about attitudes is absent in a 
verbal message, it may be possible to use some 
implicit verbal cues to infer a communicator’s 
attitude or feeling. Marsden’s (1965) review of 
the content-analysis literature indicates that al- 
though there are a few exceptions, such attempts 
are generally lacking. Exceptions are Mahl’s 
(1959) use of implicit verbal cues to infer the 
feeling of anxiety and Gottschalk, Gleser, and 
Springer’s (1963) use of such cues to infer the 
attitude of hostility. 

Recently, Mehrabian and Wiener (1966) pro- 
posed a method of “Immediacy” analysis for 
inferring communicator attitudes from verbal 
data. Immediacy is defined as the degree of di- 
rectness and intensity of interaction between a 
communicator and the object of his communica- 
tion. For example, in the comment, “I like this 
(that) piece Jane is playing,” the use of “this” 
is considered a more immediate reference to the 
piece of music than the use of “that.” In the 
comment, “We went to visit Mary and Jack,” 
the communicator’s immediacy to Mary is con- 
sidered to be greater than his immediacy to Jack. 
The Method section which follows contains addi- 
tional examples of variations in communication 
immediacy. Mehrabian and Wiener (1966) found 
that there is more immediacy in written com- 
munications about liked people or successful 
events than in communications about disliked 
people or failures. In other words, trained ob- 
servers can use immediacy to infer communicator 
attitudes. 

If the immediacy level of a verbal communi- 
cation can be used by trained observers to infer 
the degree of positive communicator attitude, it 
is possible that untrained observers can also use 
immediacy to infer attitudes. In a preliminary 
study of this problem, Mehrabian (1966) pre- 
sented Ss with pairs of verbal communications 
about a person, object, or event. The communi- 
cations in each pair differed in degree of immedi- 
acy, but did not differ in the degree of explicit 
attitude expressed towards the object of com- 
munication (e.g., “X is my neighbor” versus “X 
and I live in the same neighborhood”). For the 
five immediacy categories investigated, Ss inter- 
preted the more immediate communication in 
each pair as indicating a more positive attitude. 
The present study was designed to investigate 
whether Mehrabian’s findings would hold for the 
seven other immediacy categories. Accordingly, it 
was hypothesized that untrained Ss judge the 
more immediate of two explicitly neutral com- 
munications as indicating a more positive atti- 
tude. 


METHOD 
Subjects 


The Ss were 92 University of California under- 
graduates. 


Materials and Procedure 


The entire experiment was group-administered in 
one session. The Ss received a page of instructions, 
a booklet of communication pairs, and an answer 
sheet on which to record their judgments. The in- 
structions read as follows: 


TABLE 1 

DEFINITIONS AND EXAMPLES OF IMMEDIATE AND NONIMMEDIATE COMMUNICATIONS 

Category, followed by example pairs (immediate statement listed first) 

Distance: Demonstrative pronouns such as “the,” “this,” or “these,” 
instead of “that” or “those” are used to refer to the object of 
communication. 
    I don’t understand these Canadians. 
    I don’t understand those Canadians. 

    I’ve seen this clerk before. 
    I’ve seen that clerk before. 

Time: The relationship between the communicator and the object of 
communication is ongoing or present, instead of being temporally 
past or future. 
    John and I go fishing regularly. 
    John and I will go fishing regularly. 

    Mike is showing me his house. 
    Mike showed me his house. 

Order of occurrence: The earlier references in a sequence of 
references to several objects are considered more immediate. 
    We went to visit Jack and Jane. 
    We went to visit Jane and Jack. 

    I got the check from Pete when I went to the office. 
    When I went to the office, I got the check from Pete. 

Duration: Longer communications about an object are considered more 
immediate than shorter ones. 
    I saw Vincent mowing his lawn. 
    I saw Vincent. 

    Gloria came to my party wearing a straw hat. 
    Gloria came to my party. 

Activity-Passivity: The communicator’s interaction with the object of 
communication is stated as being voluntary rather than forced. 
    I want to help that woman. 
    I must help that woman. 

    I’m going to write a letter to Joe. 
    I have to write a letter to Joe. 

Mutuality-Unilaterality: Communications in which there is greater 
reciprocity between communicator and the object of his communication 
are considered more immediate. 
    Dave and I go for rides. 
    I go for rides with Dave. 

    Arlene and I play cards. 
    Arlene plays cards with me. 

Probability: Communications in which the communicator’s interaction 
with the object of communication is more certain are considered more 
immediate. 
    Bob and I get along. 
    Bob and I can get along. 

    We will do business with Joe. 
    We may do business with Joe. 


TABLE 2 

MEAN AGREEMENT FREQUENCY AND CORRESPONDING z SCORE FOR THE SEVEN 
IMMEDIACY CATEGORIES 

Category                    Mean agreement frequency    z score 
Distance                            0.65*                 2.88 
Time                                0.66*                 3.08 
Order of occurrence                 0.88*                 7.31 
Duration                            0.54                  0.77 
Activity-Passivity                  0.81*                 5.96 
Mutuality-Unilaterality             0.66*                 3.08 
Probability                         0.83*                 6.35 

* p < .01. 


In the stapled booklet you will find 35 pairs of 
statements by two speakers (speaker A and speaker 
B) about an identical experience involving a per- 
son, object, action, or event. Please keep in mind 
that we are assuming that both speakers are 
SHARING AN IDENTICAL EXPERIENCE. In 
each pair of statements, a person’s name, an ob- 
ject, an action, or an event has been underlined. 
For each pair of statements, we would like you 
to INDICATE THE SPEAKER (A or B) whom 
you think has a MORE POSITIVE preferential, 
evaluative, and/or affective ATTITUDE TO- 
WARDS THE UNDERLINED ENTITY. You 
are provided with an answer sheet for this pur- 
pose. For each statement pair, simply place a 
check mark in the appropriate space on the answer 
sheet. 


At this point, instructions were provided for the 
recording of judgments on the answer sheet. Exam- 
ples of the pairs of statements, which were randomly 
presented (both within and between pairs) in a 
booklet of 35 pages, are given in Table 1, with the 
immediate statement of each pair listed first. The 
assignment of immediate versus nonimmediate state- 
ments to Speakers A and B was counterbalanced. 


RESULTS 


The agreement frequency of each S for a given 
category ranged from zero (i.e., no agreement of 
S with the predictions made on the basis of im- 
mediacy criteria in all five instances of the cate- 
gory) to unity (i.e., agreement of S on all five 
instances of the category). According to the null 
hypothesis, the expected value of these agreement 
frequencies for a given category over all Ss (or 
all categories over all Ss) is .50, which corre- 
sponds to a .50 probability of successful predic- 
tion of Ss’ responses. Table 2 contains the mean 
agreement frequency values obtained for each 
category, over all Ss. It will be noted that mean 
agreement frequency values for all seven cate- 
gories exceed the chance frequency of .50 (p< 
.02, using a two-tailed sign test). Thus, for all 
immediacy categories considered together, re- 
sponses on the basis of immediacy criteria can 
be successfully predicted. Furthermore, the indi- 
vidual mean agreement frequency values of each 
of six immediacy categories (excluding the dura- 
tion category) are significantly different from .50 
(p <.01, using a two-tailed normal approxima- 
tion to the binomial distribution). 
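
The two tests described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the author's computation: it assumes the z scores were obtained as (mean agreement frequency minus .50) divided by sqrt(.25 / N) with N = 92 Ss, and that the sign test treats the seven categories as seven trials. Under those assumptions the sketch comes within rounding error of the Table 2 values; the function names are hypothetical.

```python
import math

N_SUBJECTS = 92   # from the Subjects section
CHANCE = 0.50     # expected agreement frequency under the null hypothesis

# Mean agreement frequencies as printed in Table 2.
MEAN_AGREEMENT = {
    "Distance": 0.65,
    "Time": 0.66,
    "Order of occurrence": 0.88,
    "Duration": 0.54,
    "Activity-Passivity": 0.81,
    "Mutuality-Unilaterality": 0.66,
    "Probability": 0.83,
}


def z_score(freq, n=N_SUBJECTS, p0=CHANCE):
    """Normal approximation to the binomial (assumed form of the test)."""
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    return (freq - p0) / standard_error


def sign_test_two_tailed(successes, trials):
    """Exact two-tailed sign test p value under a .50 chance of success."""
    extreme = max(successes, trials - successes)
    tail = sum(math.comb(trials, i) for i in range(extreme, trials + 1))
    return min(1.0, 2.0 * tail / 2 ** trials)


if __name__ == "__main__":
    for category, freq in MEAN_AGREEMENT.items():
        print(f"{category:<25} z = {z_score(freq):.2f}")
    above_chance = sum(f > CHANCE for f in MEAN_AGREEMENT.values())
    p_value = sign_test_two_tailed(above_chance, len(MEAN_AGREEMENT))
    print(f"sign test: {above_chance} of 7 above chance, p = {p_value:.4f}")
```

Small differences from the printed z scores are to be expected, since the published mean frequencies are rounded to two decimals.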


DISCUSSION 


For six of the seven immediacy categories in- 
vestigated, it was found that the more immediate 
statement in a pair is judged as indicating a more 
positive attitude. However, untrained Ss do not 
consistently interpret longer communications 
(duration category) as indicating more positive 
attitudes. One might speculate that the lack of 
agreement among Ss is due to the absence of im- 
plicit norms for communication length in the 
situations used in the above experiment. It would 
seem that some specification of the social context 
in which a communication takes place could 
further delimit for S the acceptable length of 
communications and thus make consistent judg- 
ments possible. 

In the above experiment, Ss were confronted 
with a forced-choice alternative of saying which 
statement expressed a more positive attitude. 
Similar findings have emerged when a more 
graduated form of judgment was obtained from 
Ss. The findings of the above experiment, to- 
gether with Mehrabian’s previous findings, sup- 
port the following conclusion: If contrasting de- 
grees of immediacy are made focal in a pair of 
statements, the more immediate statement of the 
pair is judged as expressing a greater degree of 
liking, positive evaluation, or preference toward 
the object of communication. This conclusion 
holds for 11 of the 12 immediacy categories. 

In considering the categories which yielded 
support for the hypothesis, it is interesting to 
note that a number of them subsume communi- 
cation phenomena which trained interviewers use 
to infer communicator attitudes. The probability 
category includes verbal phenomena which are 
typically subsumed under the rubric of obses- 
sional qualification and doubting. Increases in 
frequency of obsessional qualification and doubt- 
ing are taken as indicators of a negative com- 
municator attitude toward the contents with which 
the increases are associated. Similarly, the order 
of occurrence category and its interpretation co- 
incide with clinicians’ assumptions that a client 
considers topics with which he feels most com- 
fortable at the beginning of an interview and 
avoids those which make him uncomfortable 
(e.g., a variant of resistance). 

It is appropriate at this point to explore the 
implications of the above findings for attitude 
inference, not only by the trained clinician, but 
also by the untrained addressee in everyday com- 
munication situations. In a large number of inter- 
personal communication situations information 
about the attitudes of a communicator may be 
helpful, but is not readily available in the ex- 
plicit verbal contents of his communications. In 
such situations, however, there are frequent op- 
portunities for an addressee to use immediacy 
variations to infer communicator attitudes. The 
results of the present study indicate that the use 
of immediacy variations for inferring attitudes 
requires base lines for assessing the significance 
of a given degree of immediacy. It seems unlikely 
that in everyday communication situations such 
base lines would be available to the addressee in 
the form of the direct contrasts in immediacy 
employed in the above experimental paradigm. 
However, it is quite likely that such base lines 
would be available to the addressee from the 
situation in which a communication takes place, 
as in the following: Mike is in the process of 
showing his new house to John, and a mutual 
friend of theirs appears on the scene. John tells 
the friend, “Mike showed me his house” (in con- 
trast to saying, “Mike is showing me his house”), 
and so Mike does not continue to show his house 
to John. In this episode, the actual degree of 
temporal immediacy of John and the act of see- 
ing the house is high, whereas the immediacy in 
the communication is relatively low. Mike is as- 
sumed to have interpreted John’s comment as in- 
dicating a lack of preference for “seeing the 
house” and discontinued the activity. Thus, more 
generally, relatively nonimmediate communica- 
tions about an actually immediate relationship are 
assumed to lead to the inference of negative com- 
municator attitudes toward the objects being 
communicated about, if the information about the 
actual degree of immediacy is available to the 
addressee. 


ORAL INVOLVEMENT IN PEPTIC ULCER 


HOWARD M. WOLOWITZ* 
University of Michigan 


Franz Alexander hypothesized that peptic ulceration is a frequent conse- 
quence of intense oral passive needs frustrated by either defensive reaction 
formation on the part of the S or external privation. A self-report in- 
ventory of food preferences designed to measure Ss’ unconscious preferences 
for oral passive versus oral aggressive gratifications was administered to 20 
male patients hospitalized for peptic ulcer and 23 hospitalized psychiatric con- 
trols. The scores of the ulcer patients were significantly more oral passive 
(p<.005) than the controls, thereby lending support to Alexander’s hy- 
pothesis concerning the underlying oral passive involvement of persons prone 
to ulceration. 


This study is an attempt to subject a portion 
of Alexander’s (1950) psychoanalytic hypotheses 
concerning the role of orality in peptic ulcer 
patients to an experimental test. 

Alexander’s hypotheses may briefly be sum- 
marized as follows: The frustration of intense 
cravings for oral passive gratifications, either 
through an internal need to defend against them 
through reaction formation or through the ex- 
ternal lack of gratification, results in the chronic 
over-mobilization of certain physiological reac- 
tions anticipatory of oral gratifications. The re- 
sult of such over-arousal (e.g., hyperacidity of 
gastric secretions) is an increased probability 
of ulceration in such persons. 

Rapaport (1960) has warned of the dangers 
inherent in attempts to test psychoanalytic 
propositions without paying due attention to 
other specified parameters of importance. In the 
present context, the postulated defensive needs 
of some persons against the open recognition 
of their oral passive cravings represent such a 
relevant parameter. 

Consequently, to test Alexander’s ulceration 
hypothesis, a self-report questionnaire was se- 
lected, the Food Preference Inventory (FPI), 
which was designed to index Ss’ oral involve- 
ments at an unconscious level. The FPI (Wolo- 
witz, 1964) uses a forced choice between foods 
characterized by oral passive gratifications (e.g., 
soft, sweet, liquid, etc.) and oral aggressive 
gratifications (e.g., hard, sour, dry, etc.). The S 
is asked to indicate his food preferences rather 
than his actual present food intake in order to 
circumvent the effects of such extraneous influ- 
ences as temporary diets, use of dentures, 


* The author wishes to thank Rosalie Ging and 
Robyn Dawes of the Ann Arbor Veterans Adminis- 
tration Hospital for generous permission to use their 
subjects. 


temporary appetite loss, availability of money, 
etc. 

Previous research (Wolowitz, 1964) has indi- 
cated that the FPI possesses sufficient reliability 
and construct validity to be used as an index 
of the intensity of Ss’ relative oral passive versus 
oral aggressive involvement. Briefly, that research 
demonstrated that male alcoholics have, as hy- 
pothesized, significantly higher oral passive pref- 
erences than normal controls and that there was 
a significant correlation with the Goldman-Eisler 
(1956) questionnaire measure of oral personality 
attributes for all Ss. 


TABLE 1 

VALUES OF CONTROL VARIABLES IN ULCER AND 
PSYCHIATRIC PATIENT GROUPS 

Variable         Ulcer patients                Psychiatric controls 
Sex              100% Male                     100% Male 
Religion         75% Protestant                57% Protestant 
                 25% Catholic                  43% Catholic 
Education        9.9 mean grade                10.7 mean grade 
Birthplace       100% born in midwest USA      100% born in midwest USA 
Residence        100% reside in Michigan       100% reside in Michigan 
                 and Ohio                      and Ohio 
Occupation       100% semi- or unskilled       100% semi- or unskilled 
                 blue collar                   blue collar 
Age              46.8 mean                     46.1 mean 
Marital status   93% Married                   79% Married 
                                               7% Divorced 
                                               14% Single 
Race             93% White                     93% White 
                 7% Negro                      7% Negro 
Diagnosis        100% Ulcer                    14% Schizophrenic 
                                               14% Hypochondriasis 
                                               14% Depressive reaction 
                                               14% Anxiety hysteria 
                                               4% Neurological symptoms 
                                               10% Conversion hysteria 
                                               10% Anxiety reaction 
                                               10% Vocational maladjustment 
                                               0% Character disorder 

On the basis of Alexander’s hypotheses, it was 
predicted that ulcer patients would have signifi- 
cantly more oral passive preferences on the FPI 
(i.e., higher scores) than a group of nonulcerative 
controls. 


METHOD 


At a Veterans Administration Hospital, 20 male 
patients diagnosed as having peptic ulcers were 
obtained as Ss, as well as 23 unselected male psychi- 
atric patients at the same hospital who were free 
of peptic ulceration. All Ss were given a 103-item 
FPI. This type of control group was selected in 
order to control for possible induced dependency 
involved in occupying the role of resident patient 
in a hospital. Previous research has demonstrated 
(Wolowitz, 1965) that occupancy of a dependent 
role or induction of a dependent self-concept affects 
the degree of oral passive versus oral aggressive 
involvement reflected in the FPI choices. 

Both groups of patients were alike in respect to 
age, education, religion, skin color, birth place, geo- 
graphical residence, marital status, occupational level, 
and sex. The values of these variables for both 
control and experimental groups are listed in Table 1 
as well as the distribution of diagnoses for the 
psychiatric control group. None of the differences 
between the groups are significant. 


TABLE 2 

MEDIAN TEST OF THE FPI SCORE DIFFERENCES 
BETWEEN ULCER AND CONTROL Ss 

FPI score                       Psychiatric controls (a)    Ulcer patients (b) 
Above median (55 or higher)               4                        17 
Below median (54 or less)                18                         3 

Note. χ² = 8.07, df = 1, p < .005 (one-tailed); t = 4.3, 
df = 41, p < .005 (one-tailed). 
(a) N = 23, M = 50.7, SD = 7.4. 
(b) N = 20, M = 60.1, SD = 6.3. 


RESULTS 


The food preference scores of the 43 Ss were 
arranged in order from lowest (oral aggressive) 
to highest (oral passive), and the difference be- 
tween the groups was tested by the median test 
corrected for continuity by Yates’ correction 
with one case excluded at the median. In addi- 
tion, the significance of the difference between 
the means of the two groups was computed by a 
Student’s t for small samples. Table 2 shows 
that the scores of the ulcer patients are signifi- 
cantly more oral passive than those of the con- 
trols. 


DISCUSSION 


The results of this experiment lend support to 
Alexander’s hypothesis that peptic ulcer patients 
have intense oral passive needs. The scores of a 
group of male peptic ulcer patients on a self- 
report food preference questionnaire designed to 
reflect unconscious oral involvements were com- 
pared with a control group of unselected psychi- 
atric patients. The ulcer patients’ food pref- 
erence scores were significantly more oral passive 
than those of the psychiatric controls. 

It is possible that these differences are specific 
only to a comparison of ulcer and psychiatric 
patients—the latter group being marked by oral 
aggression rather than the ulcer group by oral 
passivity. Consequently, it would be desirable to 
employ other types of controls in future work. 


REPORTED MATERNAL AND PATERNAL BEHAVIORS OF 
SOLITARY AND SOCIAL DELINQUENTS 


JOHN C. BRIGHAM, JAMES L. RICKETTS, AND RONALD C. JOHNSON 1 


University of Colorado 


Previous research suggests a different etiology for solitary as opposed to social 
delinquents, with solitary delinquents coming from sociologically “normal” 
and (presumably) psychologically disturbed homes, with the reverse gen- 
erally being the case for social delinquents. Other research demonstrates that 
delinquents, as a whole, view their fathers’ more than their mothers’ behaviors 
as being of a type that might be called pathogenic. When descriptions of 
parent behavior by solitary and social delinquents are compared, solitary 
delinquents report more deviant maternal behavior. It was concluded that 
solitary and social delinquents differ in the etiology of their behavior difficulties, 
with both varieties having disturbed relations with male authority figures, 
but only the solitary delinquents having disturbed relations with female 
authority figures. 


Following the rationale of Lindesmith and 
Dunham (1941), Randolph, Richardson, and 
Johnson (1961) compared solitary (all offenses 
committed alone) with social (all offenses com- 
mitted in the company of others) male juvenile 
delinquents. As would be predicted from Linde- 
smith and Dunham’s work, most solitary delin- 
quents were found to come from ostensibly nor- 
mal middle-class homes, but to show a rather 
extreme degree of psychological disturbance as 
measured by the MMPI, while social delinquents 
were generally from lower-class homes located in 
high delinquency areas, with few evidences of 
psychological pathology. These data indicate that 
the solitary delinquent typically is a psychologi- 
cally disturbed individual, while the social de- 
linquent is a far more normal individual, in a 
psychological sense, who comes from a crimo- 
genic social environment. As compared with so- 
cial delinquents, solitary delinquents, despite 
their cultural advantages, are more often recidi- 
vists (Johnson, 1950), presumably because their 
delinquency is a symptom of a deeper, more 
underlying disorder. It would seem reasonable to 
believe that various treatment techniques might 
differ in effectiveness for solitary, as opposed to 
social, delinquents. It is, therefore, of some prac- 
tical as well as theoretical value to determine 
more of the distinguishing characteristics between 
these two groups. 

Medinnus (1965) tested delinquents and con- 
trol Ss, using the Roe and Siegelman (1963) 
Parent-Child Relations Questionnaire (PCR), in 
which each S describes his father’s and mother’s 


1 The authors would like to express their gratitude 
to Allen Childers of the Federal Correctional In- 
stitution, Englewood, Colorado. 


characteristic behaviors toward himself. Medin- 
nus found that delinquents and controls did dif- 
fer in the way that they viewed their mothers’ 
behaviors, with delinquents viewing their moth- 
ers’ behaviors as being more on the psychologi- 
cally deleterious end of behavior continua than 
controls. However, delinquents differed far more 
from controls in reported paternal behavior, view- 
ing their fathers as being much more neglecting, 
demanding, rejecting, and punishing and less lov- 
ing and protecting. Medinnus did not categorize 
Ss within his delinquent group in any way. In 
light of the present writers’ interest in establish- 
ing differences in etiological factors between soli- 
tary and social delinquents, the present study may 
be considered to be, in part, a replication of 
Medinnus’ work, but with the chief concern be- 
ing the measurement of differences in reported 
parent behaviors between solitary and social de- 
linquents rather than between delinquents and 
matched controls. 


METHOD 
Subjects 


From a total population of 350 inmates of the 
Federal Correctional Institution, Englewood, Colo- 
rado, 59 Ss were selected. These Ss were selected 
randomly from within that segment of the 
population that was Caucasian, judged by institu- 
tional administrators to be sufficiently literate to 
respond to a questionnaire, and not taking part in 
the daytime work release program. Three Ss did not 
understand or did not wish to follow the in- 
structions, and nine Ss could not be categorized in 
either the solitary or social delinquent category, leav- 
ing a total of 47 Ss whose responses were used in 
the present study. These 47 delinquent Ss had a 
mean age of 17.6 (range, 15-20), and a mean IQ of 
102.5 (range, 83-121). Solitary and social delinquents 
did not differ significantly in age or IQ, though the 
difference between solitary (mean IQ, 104) and so- 
cial (mean IQ, 101) delinquent IQs approached sig- 
nificance. The solitary, as opposed to social, delin- 
quents were separated from one another on the basis 
of three questions interspersed among the other items 
of a much larger questionnaire (the remainder of 
which is not relevant to the present study). These 
questions were: 


When you’ve broken the law, were you usually alone, or did you break the law with others? 

1. Always alone 
2. Alone most of the time; sometimes with others 
3. Times alone and times with others about equal 
4. With others most of the time; sometimes alone 
5. Always with others 

When you break the law, do you usually do it: 

1. Alone 
2. With one other person 
3. With two or three other persons 
4. With four or more other persons 

When you’ve broken the law, how many times have you broken it ALONE? 

1. Never or almost never 
2. ¼ of the time 
3. ½ of the time 
4. ¾ of the time 
5. Always or almost always 

The Ss who checked one of the first two choices 
in the first question, the first choice in the second 
question, and one of the last two choices in the 
third question were considered to be solitary de- 
linquents. The Ss who checked one of the last two 
choices in the first question, any but the first choice 
in the second question, and one of the first two 
choices in the third question were considered to be 
social delinquents. As noted above, nine Ss were not 
consistent in their responses and were discarded. 
Twenty-two Ss fell into the solitary category, 
while 25 were judged to be social delinquents. 
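
The classification rule just described can be written out as a short function. The following is a minimal sketch, assuming each answer is coded by its choice number (1-5 for the first and third questions, 1-4 for the second); the function and argument names are hypothetical and do not come from the original study.

```python
from typing import Optional


def classify_delinquent(q1: int, q2: int, q3: int) -> Optional[str]:
    """Return "solitary", "social", or None for an inconsistent pattern.

    q1: usually alone vs. with others (1 = always alone ... 5 = always with others)
    q2: usual number of companions (1 = alone ... 4 = four or more other persons)
    q3: how often the law was broken alone (1 = never ... 5 = always)
    """
    # Solitary: first two choices on Q1, first choice on Q2, last two on Q3.
    if q1 in (1, 2) and q2 == 1 and q3 in (4, 5):
        return "solitary"
    # Social: last two choices on Q1, any but the first on Q2, first two on Q3.
    if q1 in (4, 5) and q2 in (2, 3, 4) and q3 in (1, 2):
        return "social"
    # Any other combination is inconsistent; such Ss were discarded.
    return None


if __name__ == "__main__":
    print(classify_delinquent(1, 1, 5))  # consistent solitary pattern
    print(classify_delinquent(5, 3, 1))  # consistent social pattern
    print(classify_delinquent(3, 2, 3))  # inconsistent pattern
```

Requiring agreement across all three questions is what produces the discarded cases: an S who gives a solitary answer to one question and a social or middle answer to another falls into neither group.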


Measuring Devices and Procedure 


The PCR consists of a “mother” form and a “father” form, each of 
which contains 130 items, forming 10 scales. Six of the scales have 
15 items each and were developed to tap occurrences of behaviors 
perceived as being protecting, rejecting, casual, demanding, loving, 
and neglecting. Four additional categories, containing 10 items 
apiece, are devoted to the assessment of two punishing and two 
rewarding types of parental behavior. Symbolic-love reward and 
symbolic-love punishment denote the use of methods such as praise 
for approved behavior, special attention, and demonstrative affection 
as rewards; shaming, isolating, and withdrawal of love as punishments. 
Direct-object reward and direct-object punishment denote the use of 
gifts, money, and the reduction of chores as rewards; physical 
punishment, taking away playthings, and denying promised rewards as 
punishments. 

Medinnus (1963) was successful in using the PCR with juvenile 
delinquents. Nevertheless, the present writers felt that the 
vocabulary level of some of the items could be reduced, making the 
items more clear without altering the meaning. The writers did revise 
these items, hopefully making them more easily understandable.2 The 
revised PCR for each parent; another questionnaire having to do with 
offenses and with adjustment to, and attitudes toward, the 
institution; and several other instruments not relevant to the 
present study were administered to the delinquent Ss in groups 
ranging in number from 7 to 12. Approximately half of the Ss in each 
group responded to the PCR concerning mothers first, while they 
received the PCR relating to fathers’ behavior last of all measures. 
The reverse was true for the other half of the Ss. 


RESULTS AND DISCUSSION 

The responses of the solitary and social delinquents to the mother 
and the father PCR scales are shown in Table 1. 

Solitary delinquents differ from social delinquents in viewing their 
mothers as using more symbolic-love punishments and as being more 
rejecting and neglecting and less loving. Solitary delinquents report 
their fathers to be more neglecting than do social delinquents.3 


2 A copy of the altered Roe-Siegelman Parent-Child Relations 
Questionnaire has been deposited with the American Documentation 
Institute. Order Document No. 9486 from ADI Auxiliary Publications 
Project, Photoduplication Service, Library of Congress, Washington, 
D. C. 20540. Remit in advance $1.75 for microfilm or $2.50 for 
photocopies, and make checks payable to: Chief, Photoduplication 
Service, Library of Congress. 

3A group of 35 male college Ss also was tested, 
using the revised PCR scales. While this group was 
not an adequate control group, it should be men- 
tioned that the college Ss did not differ significantly 
from the social delinquents on any of the 10 ma- 
ternal scales, while the solitary delinquents and col- 
lege Ss differed on 5 of the 10 maternal scales. So- 
cial delinquents differed significantly from college Ss 
on 4 of the 10 paternal scales, while solitary de- 
linquents differed significantly on 5 of the 10 pa- 
ternal scales. These data support the position taken 
herein—that only some varieties of delinquents re- 
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TABLE 1 


MEAN PCR DIMENSION Scores For SOLITARY 
AND SOCIAL DELINQUENTS 


Delinquent FRR 
3 fi Significance 
PCR dimension |——— of difference 
Solitary | Social? 
Mother 
Protecting 38.95 | 41.00 ns 
Symbolic-love punish- 
ment 29.00 | 25.44 .05 
Rejecting 33.82 24.72 .01 
Casual 45.59 | 49.44 ns 
Symbolic-love reward | 32.23 | 36.04 ns 
Demanding 44.14 | 42.60 ns 
Direct-object punish- 
ment 29.55 25.64 ns 
Loving 52.91 | 60.92 .05 
Neglecting 31.55 | 24.96 05 
Direct-object reward 29.55 | 31.72 ns 
Father 
Protecting 37.14 | 36.52 ns 
Symbolic-love punish- 
ment 29.68 | 29.52 ns 
Rejecting 41.86 | 37.92 ns 
Casual 42.82 | 44.04 ns 
Symboliclovereward | 29.05 | 28.80 ns 
Demanding 51.50 | 49.56 ns 
Direct-object punish- 
ment 29,82 | 29.36 ns 
Loving 42.32 | 47.36 ns 
Neglecting 40.23 | 33.32 .05 
Direct-object reward 23.27 25.32 ns 


Note.—PCR scoring was done according to the Roe-Siegel- 
mae method in which each response has a possible score of 1-5. 
aN = 


bN =25. 


Notes AND COMMENTS 


The present results suggest that disturbed 
mother-son relations are present in the solitary 
far more than in the social delinquent group, 
once delinquents have been differentiated into 
these two categories. Like previous studies (John- 
son, 1950; Randolph et al., 1961) the present 
study suggests a somewhat different—and prob- 
ably more pathological—etiology for solitary as 
compared with social delinquency, at least with 
regard to institutionalized delinquents. It may 
be, of course, that the solitary delinquent, more 
often from middle-class surroundings, has to be 
either quite severely delinquent or have a very 
disturbed and rejecting home environment to be 
institutionalized at all. Thus, a different kind of 
selection might occur for the two groups, and 
obtained differences would hold only for those 
solitary and social delinquents who are institu- 
tionalized. 


port disturbed relations with their mothers—and are 
in accord with Medinnus’ findings that delinquents, 
regardless of type, differ from nondelinquents in 
reported paternal behavior. 
VERBALIZATION, STIMULUS RELEVANCE, AND PERSONALITY CHANGE 


MILTON F. SHORE 
National Institute of Mental Health 

AND 

JOSEPH L. MASSIMO 1 
Judge Baker Guidance Center 


Verbal productivity in thematic stories given over a 10-mo. period by 3 
groups (successfully treated adolescent delinquent boys, untreated delinquents, 
and nondelinquents) revealed that the number of words increased significantly 
only for the treated delinquents and only in 1 of the areas chosen for stimulus 
relevance, that of control of aggression. The productivity in stories to self- 
image and attitude toward authority showed no change. These results were 
consistent both with theory and the treatment goals. They suggest that verbal- 
ization may serve as 1 mechanism through which control over overt hostile 
behavior may be obtained. 


In a previous publication in this journal, the 
authors (Shore, Massimo, & Mack, 1965) ana- 
lyzed the changes in the perception of interper- 
sonal relationships in successfully treated ado- 
lescent delinquent boys in relation to changes in 
academic performance and overt behavior. The 
mechanisms by which the changes may have occurred 
were not clear. Further analysis and the 
collection of additional data have added to the 
understanding of some possible mechanisms by 
which the successful psychotherapy may have 
occurred. 

Although many developmental psychologists 
and a large number of psychotherapists have 
commented on the role verbalization might play 
in the evolution of control over overt behavior 
(so that symbolization and vicariousness substitute 
for direct experience), no systematic studies 
have been reported in which changes in verbalization 
resulting from psychotherapy with acting-out 
individuals were evaluated. 

This study investigated the changes in the degree 
of verbalization resulting from a successful 
psychotherapeutic program for adolescent delinquents. 
The questions asked were: 

1. Is there a change in the degree of verbalization 
as a result of successful psychotherapy 
with delinquents (where overt hostile behavior 
has significantly decreased)? 

2. If there is a change, is this a generalized 
change (perhaps due to increased cooperation) 
or is it specific to certain stimulating conditions? 


PROCEDURE 


The rationale, experimental design, tests administered, 
and the nature of the psychotherapeutic program 
have all been described in a previous publication 
(Shore et al., 1965). It should be noted that 
the specific goal of the treatment program was to 
reduce the antisocial behavior by helping the boys 
become aware of the consequences of their hostile 
behavior, by suggesting more socialized avenues for 
the expression of frustration and rage when they 
arose, and by offering many opportunities for success 
and mastery. 


1 The authors wish to thank Janet K. Moran for 
assistance with the statistical analysis of the 
data. 

The second-named author is now at the Newton 
Public Schools, Newton, Massachusetts. 

Additional data were added to those obtained 
from the treated and untreated delinquent groups 
over the 10-month period. Fifteen boys of the same 
age, IQ, and socioeconomic level, having no known 
history of antisocial behavior, were administered 
the tests twice over a 10-month period. This addi- 
tional control group was selected in order to gain 
increased understanding of the nature of the changes 
that occurred in the treated and untreated groups. 

The open-ended nature of the task of telling 
thematic stories permitted the boys to give as long 
or as short a story as they chose in relation to the 
picture. Word counts were made on all the thematic 
stories given to 15 pictures in the three areas of self- 
image, attitude toward authority, and control of 
aggression, including the answers to questions asking 
for clarification of the story when appropriate. (An 
analysis of the number of questions asked by the 
examiner revealed no differences in the numbers of 
questions asked in each group.) 


RESULTS 


There was no significant difference between the 
three groups in the number of words given to 
the stories on first testing. However, over the 
10-month period there was a significant change 
in verbalization in the groups. Table 1 shows the 
F tests for the analysis of variance for the three 
groups over the two conditions. One F reached 
significance. Under the conditions in which hos- 
tile feelings are believed to be aroused by the 
stimuli, the groups showed marked changes in 
their use of words. The treated group rose in the 
number of words used; the untreated group 
dropped; the nondelinquent group (as might be 
TABLE 1 

F TESTS OF ANALYSIS OF VARIANCE FOR THE WORD COUNT OF 
THREE GROUPS IN TWO CONDITIONS 

                                               F tests 
Variance               df   Self-image   Control of aggression   Attitude toward authority 
Groups                  2      1.37             1.44                     1.17 
Error                  32 
Conditions              1     11.38             3.62                     2.39 
Groups X Conditions     2      1.66             4.08*                    1.69 
Error                  32 

*p < .05. 


expected) showed no change. The change was 
not a general one (such as might result from in- 
creased cooperation, marked blocking, or resist- 
ance), but was specific to the nature of the stim- 
ulating condition. 


DISCUSSION 


The treatment process in the delinquent boys 
consisted primarily of attempts to get the boys 
to realize the potentially self-destructive conse- 
quences of their hostile activity and to develop 
some behavioral controls. It has been shown that 
this was accomplished (Massimo & Shore, 1963) 
and that this change may have reflected itself in 
the area of verbal productivity. Under conditions 
in which hostile feelings and their control were 
stimulated, the treated group appeared to use 
words to deal with the situation. The fact that 
the number of words after 10 months was some- 
what greater in the treated group than in the 
nondelinquent group suggests that the treated 
group may still be experiencing the arousal of 
intense hostile motives and may find it necessary 
to overreact verbally in order to ensure that 
there is no breakthrough of these feelings into 
overt hostile behavior. This might be analogous 
to the young child who, in the development of 
behavioral controls, often will verbalize prohibi- 
tions in an effort to stop himself, a stage which 
seems to be preliminary to the internalization of 
an adequate system of regulation over his be- 
havior. 
PATERNAL ABSENCE AND OVERSEAS SUCCESS OF 
PEACE CORPS VOLUNTEERS 


PETER SUEDFELD 1 
Rutgers—The State University 


Paternal absence during childhood differentiated significantly between successful 
and unsuccessful Peace Corps volunteers. In 2 independent samples the 
proportion of individuals from fatherless homes was significantly greater among 
unsuccessful volunteers. 


The recent furor over the so-called Moynihan 
(1965) report may serve to direct attention to 
the role of the father in the development of his 
children. This factor has been greatly neglected 
in research on human developmental psychology 
(see, e.g., Baldwin, 1959) and, particularly, in 
the consideration of antecedent influences on 
adult behavior. 


1 The data were collected while the author was 
Expert Consultant to the Field Selection Branch, 
Division of Selection, Peace Corps. A preliminary 
report of these findings has appeared in the Peace 
Corps Field Selection Newsletter. I am grateful for 
the help and support of R. B. Voas, B. O. Baker, 
and the other personnel of the Field Selection 
Branch. 

Pasley (1955) published a book containing 
brief biographies of the 21 Americans and 1 
Briton who elected to stay in Communist China 
after the Korean war. As Pasley pointed out, 
of the 21 Americans 11 had “lost their fathers 
at an early age, through divorce or death [p. 
223].” Of the 21, 19 “felt unloved or unwanted 
by fathers or stepfathers [p. 222].” This 
finding provided the impetus for the research 
described below. 


PROCEDURE 


In the course of a study on predictors of overseas 
success among Peace Corps volunteers, data were 
collected concerning the presence of the biological 
father during the volunteer’s childhood. For the pur- 
poses of the study, “absent father” was defined as 
one who had been absent from the child’s residence, 
for whatever reason, during at least 5 years before 
the future volunteer’s fifteenth birthday. 

Two samples of volunteers’ files were drawn ran- 
domly. Thirty-five Ss were serving successfully 
overseas; 34 others had returned to the United 
States before the scheduled completion of their 
overseas tours because of problems of adjustment 
and conduct (including psychiatric terminations). 
Still-serving volunteers were used as controls to 
avoid the possibility that Ss who volunteered for 
the Peace Corps over 2 years ago were systematically 
different from more recent enrollees. While it is true 
that some members of the control sample may still 
return early, it is relatively unlikely since most of 
them were near the end of their scheduled service. 


RESULTS AND DISCUSSION 


Paternal absence differentiated extremely well 
between stayers and terminators: 9% of the 
former and 44% of the latter came from 
absent-father backgrounds (χ² = 10.828, df = 1, 
p = .001). This clear-cut finding was so unexpected 
that two more samples were drawn in 
the same way, one consisting of 27 stayers and 
the other of 28 early terminators. This time, 
14% of the stayers and again 44% of the 
returners had had absent fathers (χ² = 9.436, 
df = 1, p < .005). 
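
The comparison of stayers and terminators is a 2 x 2 test of association, which can be sketched as follows. The report gives percentages rather than cell counts, so the counts below are assumptions back-calculated from the rounded percentages for the first pair of samples (9% of 35, 44% of 34). The report also does not say whether a continuity correction was applied. The statistic computed here is therefore illustrative and is not expected to reproduce the reported value exactly. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# ASSUMED cell counts, back-calculated from rounded percentages in the text:
# about 9% of 35 stayers and about 44% of 34 early terminators had absent fathers.
stayers_absent, stayers_present = 3, 32
terminators_absent, terminators_present = 15, 19

chi2 = chi_square_2x2(stayers_absent, stayers_present,
                      terminators_absent, terminators_present)
print(f"chi-square (df = 1) from assumed counts: {chi2:.2f}")
```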

It should be noted that the measures used 
were quite crude. For the absent-father category, 
such factors as age at separation, reason for 
paternal absence, “psychological absence” of a 
physically available father, and relationship to a 
stepfather or father surrogate were not considered. 
In applying the stay-return criterion, no allowances 
were made for subjects’ job assignments, 
locations, and specific events leading to separation 
from the Peace Corps. Refining these criteria 
should increase predictive power even more, 
and may pinpoint some exact consequences of 
various father-child relationships. 

As for alternate explanations, the racial-socio- 
economic factor which has been so closely iden- 
tified with fatherless families (e.g., in both the 
Moynihan and the Pasley publications) was not 
important here—almost 100% of the subjects 
were college graduates, overwhelmingly from 
white middle class families. Neither any 
other family variable (absent mother, family 
size, birth order, sex of volunteer) nor the 
interaction effect was able to discriminate be- 
tween the groups. The only other difference of 
even tentative value was that among 29 early 
terminators, firstborn volunteers tended to re- 
main overseas somewhat longer before returning 
than did later borns (Mdn=10.0 versus 7.1 
months, U = 54, p<.05, one-tailed). (The use 
of a one-tailed test was based on a report by 
Dohrenwend and Dohrenwend, 1966, that when 
social interaction is possible—as in the Peace 
Corps—firstborn subjects can endure real-life 
stress better than later borns.) 

In view of the great power of the absent- 
father variable, it seems important to follow up 
these findings in both naturalistic and laboratory 
settings. Were the differences due to reduced 
stress tolerance as a result of paternal absence, 
to greater dependence, or to less flexibility in 
unusual situations? The answer to such questions 
appears to be of theoretical, as well as practical, 
significance. With further research, it may turn 
out that paternal absence has consequences as 
important as those of that other recently 
exhumed factor, birth order. 
RELATION OF SUCKING STRENGTH TO PERSONALITY VARIABLES 


PAUL BERGMAN,1 CHARLOTTE MALASKY, AND THEODORE P. ZAHN 
National Institute of Mental Health 


An attempt was made to demonstrate a relationship between sucking strength 
in normal adults and personality variables. Tests of sucking strength, control 
measures of manual strength and vital capacity, and MMPI questionnaires 
were given to 44 men and 33 women. Men and women both had significant 
negative correlations between sucking and Welsh’s R dimension, after age 
correction. Males also showed positive correlations between sucking and the 
physical tests of manual strength and vital capacity. For women, however, 
sucking was related to several personality variables but not to the physical 
tests. This sex difference in correlational pattern suggests that differences in 
sucking strength are determined more by personality variables in women and 
more by physical strength variables in men. If it is assumed that repression 
(as indicated by Welsh R) may stem from unconscious conflicts about oral 
needs, then the data are compatible with the idea that the sucking tests reflect 
similar conflicts. 


A series of studies on various aspects of oral 
functioning was undertaken in an attempt to 
develop objective behavioral measures applicable 
to the investigation of psychoanalytic concep- 
tions of personality functioning. The general hy- 
pothesis of these studies was that persons with 
severe personality disorders, which according to 
psychoanalytic theory are attributable to some 
trauma or disruption in the early (oral) stage 
of personality development, should show some 
residual weakness or disorganization in oral 
functioning. An investigation of oral functioning 
as related to psychopathology showed that, of a 
number of oral functions measured, an impair- 
ment in sucking performance was most charac- 
teristic of pathological groups, particularly in 
women patients. Further, a developmental study 
of sucking (Bergman, Malasky, & Zahn, 1966) 
showed that normals vary much more in sucking 
strength than in manual strength within age 
groups. It was suggested that differences in atti- 
tude toward the sucking tasks, which apparently 
were not adopted by the schizophrenics to the 
manual strength task, might partially explain the 
greater variability in sucking. In an effort to 
demonstrate more directly a relationship between 
sucking and personality variables, we attempted 
in the present study to relate sucking perform- 
ance of normal men and women to personality 
characteristics, as measured by the MMPI. 


1 Paul Bergman’s untimely death prevented him 
from carrying this project to its conclusion. He de- 
vised the oral tests and supervised the data collec- 
tion and most of the data analysis. 

2 In preparation. 


METHOD 
Subjects 


The Ss were 44 males (age 18-75) and 33 females 
(age 18-64) who were normal volunteers at the 
National Institutes of Health or relatives of schizo- 
phrenic patients. All Ss were white and middle class, 
and none showed severe psychopathology or any 
incapacitating physical illness. 


Tests 


In addition to two tests of sucking, measures of 
manual strength and “vital capacity” were obtained 
for control purposes. Each S was also given an 
MMPI to complete at his leisure. The tests were 
as follows: 

Single suck. A tygon hose 904 centimeters long, 
1 centimeter inner diameter, was attached to a 
vacuum meter which measured negative pressure in 
terms of millimeters of mercury. The S was told 
how the hand of the vacuum meter would respond 
when he sucked out air from the open end of the 
hose. The S’s task consisted of taking one suck, 
as strong as possible. This was repeated four times. 
The highest reading of the four trials was used as 
the score on this test. 

Multiple sucking. The S used the same equipment 
as in the previous test, but was instructed to make 
a continuous sucking effort to drive the score as 
high as possible. If he found it difficult to do so, 
the experimenter gave advice or demonstrated how 
he would do it. He suggested, for instance, the 
holding of the tongue against the end of the hose 
to preserve the achieved vacuum, the putting of 
the tongue in a different part of the mouth, etc. 
This was done to encourage and facilitate the S’s 
achieving the highest level of functioning of which 
he was capable. The highest of four trials was used 
as the score. 
TABLE 1 

PRODUCT-MOMENT CORRELATIONS AND PARTIAL CORRELATIONS 
BETWEEN THE EXPERIMENTAL MEASURES AND MMPI VARIABLES 
FOR MEN 

Variable               Correlation     Partial correlation(a) 
Sucking score 
  MMPI R                 −.48**          [illegible] 
  MMPI Hy                 .30*            .26 
  MMPI Ma                [illegible]      .18 
  Manual strength         .58**          [illegible] 
  Vital capacity         [illegible]     [illegible] 
  Age                    −.39** 
Vital capacity 
  MMPI R                 −.23            −.33* 
  MMPI Hs                −.36*            .01 
  MMPI Ma                [illegible]*     .26 
  MMPI Mf                 .30*           −.32* 
  MMPI D                 −.32*           −.21 
  Manual strength         .70**           .56** 
  Age                    [illegible] 
Manual strength 
  MMPI R                 −.19            −.20 
  MMPI Hs                −.32*           [illegible] 
  MMPI D                 −.33*           −.24 
  Age                    −.52** 

Note. N = 44. 
a Age partialed out. 
*p < .05. 
**p < .01. 

Manual strength. Hand pressure on a hand dyna- 
mometer was measured. The instructions were to 
squeeze as hard as possible, once using the preferred 
hand and once using both hands. As the 
two measurements proved highly correlated (r = .91), 
only the score achieved with both hands was used. 

Vital capacity. The S was asked to fill his lungs 
with air and then to blow the air through a mouthpiece 
into a “vital capacity apparatus” which registered 
the amount of air expired. Three trials were 
given; the highest score was used. 


RESULTS AND DISCUSSION 


Correlation matrices for males and females 
were computed separately. In addition to the 
four tests described above, the variables included 
the “sucking score,” which is a combination of 
the single- and multiple-suck scores,3 age, the 
regular MMPI scales,4 and three special MMPI 
scales: Welsh’s A and R and Barron’s Ego 
Strength. As the sucking measures were highly 
intercorrelated for both men and women, the 
sucking score was, for economy, considered as 
representative of sucking performance. Tables 
1 and 2 present for men and women, respectively, 
all the significant correlations between 
MMPI variables and the physical test scores 
plus some additional correlations for the sake 
of comparison. Partial correlations to remove 
the effect of age are also shown. 


3 The sucking score for each S equals the sum of 
twice the single-suck score plus the multiple-suck 
score. This particular weighting was used in order to 
give about equal power to each of the sucking tests, 
since the mean multiple-suck score (for normals) 
was about twice the mean single-suck score. 
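
The weighting used for the combined sucking score (twice the single-suck score plus the multiple-suck score) can be written out as a short sketch. The function name and the example pressures are hypothetical. The example values are chosen only so that the multiple-suck reading is twice the single-suck reading, the ratio the authors describe for the group means.

```python
def sucking_score(single_suck_mmhg, multiple_suck_mmhg):
    """Combined score: twice the single-suck reading plus the multiple-suck reading."""
    return 2 * single_suck_mmhg + multiple_suck_mmhg


# Hypothetical readings (mm of mercury), not data from the study.
single, multiple = 100, 200
score = sucking_score(single, multiple)

# With multiple = 2 * single, each test contributes the same amount to the sum.
single_contribution = 2 * single
multiple_contribution = multiple
print(score, single_contribution, multiple_contribution)
```

When the multiple-suck reading is about twice the single-suck reading, doubling the single-suck reading puts the two tests on roughly the same scale, which is the stated reason for the weighting.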

For men, before partialing out age, sucking 
score correlated with several personality varia- 
bles as well as with age, manual strength, and 
vital capacity. However, with age partialed out 
the only MMPI variable to correlate signifi- 
cantly with sucking was Welsh’s R (r= —.48, 
p<.01). Manual strength and vital capacity 
both correlated positively with sucking, below 
the .01 level. A similar picture held for vital 
capacity as well, which, after partialing out 


4 These scores were K-corrected where appropriate. 


TABLE 2 

PRODUCT-MOMENT CORRELATIONS AND PARTIAL CORRELATIONS 
BETWEEN THE EXPERIMENTAL MEASURES AND MMPI VARIABLES 
FOR WOMEN 

Variable               Correlation     Partial correlation(a) 
Sucking score 
  MMPI R                 −.51**          −.39* 
  MMPI Ma                 .62**           .48** 
  MMPI Hs                −.43*           −.26 
  MMPI K                 −.52**          −.50** 
  MMPI L                 −.37*           −.31 
  MMPI A                  .31            [illegible] 
  Manual strength         .30             .09 
  Vital capacity          .23             .16 
  Age                    −.53** 
Vital capacity 
  MMPI R                 −.19             .07 
  MMPI D                 −.39*           −.27 
  MMPI Hs                −.39*           [illegible] 
  Manual strength        [illegible]      .55 
  Age                    [illegible] 
Manual strength 
  MMPI R                 −.22            −.06 
  MMPI Hy                −.47**          −.33 
  MMPI Hs                −.40*           −.25 
  Age                    −.43* 

Note. N = 33. 
a Age partialed out. 
*p < .05. 
**p < .01. 
age, correlated positively with manual strength 
(p < .01) and negatively with Welsh R (p < .05) 
and Mf (p< .05). Manual strength was not re- 
lated to any personality variable after partial 
correlation. 
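
The age correction described above is a first-order partial correlation. As a minimal sketch, the standard formula is applied here to three zero-order values legible in Table 1 for men: sucking score with manual strength (.58), sucking score with age (−.39), and manual strength with age (−.52). The function name is hypothetical, and because the inputs are rounded the result is approximate. It comes out near .48, in line with the statement that manual strength still correlated positively with sucking after age was partialed out.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Zero-order correlations for men as printed in Table 1 (rounded to two places).
r_suck_manual = 0.58
r_suck_age = -0.39
r_manual_age = -0.52

r_suck_manual_given_age = partial_corr(r_suck_manual, r_suck_age, r_manual_age)
print(round(r_suck_manual_given_age, 2))
```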

For women, Table 2 shows that sucking score 
correlated negatively with age, as with men, but 
correlated neither with manual strength nor 
vital capacity. However, it was related to more 
personality variables than in the case of men. 
With age partialed out, sucking score correlated 
negatively with K and Welsh’s R (p< .01) and 
positively with Ma (p<.01) and Welsh’s A 
(p<.05). In contrast, neither vital capacity 
nor manual strength correlated with any per- 
sonality variables after age correction, and vital 
capacity, unlike sucking, showed a high positive 
correlation with manual strength (p < .01). 

The data show, then, that of the personality 
variables measured only Welsh’s R dimension 
correlated significantly with sucking strength for 
both men and women. Although it is perhaps 
overstating the case to equate R directly to 
repression, Welsh’s analysis of the scale (Welsh, 
1956) suggests that persons who score high on 
it are socially introverted and unaggressive, lack 
energy, dislike excitement, and deny strong affect 
and “id-like” impulses. 

In view of our previous finding of impaired 
sucking strength in females with various types 
of psychopathology, it is of interest that females 
with psychiatric disorders also score high on the 
R dimension. In Welsh’s survey of 30 groups of 
subjects scored on R, the three highest group 
means are for the only three groups of female 
psychiatric patients on the list (Welsh, 1956). 
These independent findings, that female psychi- 
atric patients have high R scores and impaired 
sucking ability, provide an indirect confirmation 
of the negative relationship between sucking 
strength and R found in the present study. 

If it is assumed that repression may stem 
from unconscious conflicts about oral needs, then 
the data are compatible with the idea that sucking 
tests reflect similar conflicts. On the other 
hand, alternative explanations are possible. One 
of these is that persons scoring high on R may 
be embarrassed by the task. This is consistent 
with the hypothesis advanced in an earlier paper 
(Bergman et al., 1966) that the greater variance 
in sucking strength as opposed to manual 
strength might be due partially to differences in 
subjects’ attitudes toward the test. However, we 
have no direct observational evidence to support 
the embarrassment hypothesis, and, in any case, 
this hypothesis leaves open two questions: first, 
why subjects who are embarrassed by a task 
should do poorly on it; second, whether uncon- 
scious conflicts about oral needs can be con- 
sidered to be the source of the embarrassment. 

The different patterns of correlations with 
sucking strength in men and women suggest that 
differences in sucking strength are determined 
more by personality variables (attitudes?) in 
women and more by physical strength variables 
in men. This is consistent with the previous 
finding that psychopathology has a greater effect 
on sucking strength in women. 

The data presented here indicate that objective 
measures of oral functioning may be promising 
methodological tools in the investigation of per- 
sonality dynamics. Further work is needed to 
refine the oral measures and to provide a more 
rigorous assessment of the personality dimensions 
possibly related to them. 
BRIEF REPORTS 


PSYCHOTHERAPEUTIC PERSISTENCE 1 


WIRT M. WOLFF 

Dallas, Texas 


Persistence of clients in psychotherapy can 
be the most necessary, yet insufficient, condition 
for psychotherapeutic effectiveness. Psychothera- 
peutic persistence, defined as number of inter- 
views, can range from cancellation of an initial 
appointment to continuation far beyond 1,000 
interviews. Such extremes are rarely dealt with 
in research or practice, but there is need for 
ways of estimating degree of client persistence 
or “stayability.” The present report considers 
intake MMPI scores as a means of predicting 
client persistence in psychotherapy. 

The Ss were selected from the 90 females and 
100 males who terminated psychotherapy within 
a 3-year period from a group of practicing psychologists. 
The Ss selected were all clients with 
an intake MMPI, 24 females and 33 males. 
Clients presented various characterological and 
neurotic patterns and were typically seen weekly 
for individual interviews on personal-family 
difficulties. The Ss were mostly in their 20s 
to 40s and were generally effective people of 
above average educational and socioeconomic 
status. Proportionately more females were married 
(two-thirds) than males (one-third). Interviews 
ranged from 1 through 145 (M = 21.1) 
for females, and from 1 through 105 (M = 17.8) 
for males. The sexes had equivalent percentages 
of MMPI primary-scale elevations (T > 70) and 
very similarly shaped mean profiles (ρ = .98; 
p < .01). The Ss seemed representative of our 
private clientele except that Ss, more often than 
non-MMPI clients, persevered beyond an intake 
interview (χ² = 29.4; p < .001). Thus, results 
predominantly apply to private clients who 
showed more than minimal persistence, that is, 
continued beyond an initial interview. Results 
are based on Pearson r’s between MMPI raw 
scores with K corrections and number of 
interviews. 


1 An extended report of this study may be obtained 
without charge from Wirt M. Wolff, 5925 
Forest Lane, Dallas, Texas 75230, or for a fee from 
the American Documentation Institute. Order Document 
No. 9485 from ADI Auxiliary Publications 
Project, Photoduplication Service, Library of Congress, 
Washington, D. C. 20540. Remit in advance 
$1.25 for microfilm or $1.25 for photocopies, and 
make checks payable to: Chief, Photoduplication 
Service, Library of Congress. 

Results for females revealed significant cor- 
relations between persistence and scales D and 
Pt around .40, and a correlation with K of —.40. 
The multiple r was .77 (p < .001). The raw-score 
equation for persistence was: 1.377 D + 
2.189 Pt − 3.767 K − 30.559; σ_est = 18.5. Cross-validation 
of these findings is planned. Thus, the 
more depressed and anxious and less defensive 
female Ss were prone to persevere in therapy. 
Results are consistent with clinical theory indi- 
cating clients with more depression, anxiety, 
ruminativeness, passive dependency, and less 
ego resourcefulness and psychological defen- 
siveness are apt to show more psychotherapeutic 
persistence. 
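
The raw-score equation reported for females can be applied as in the following minimal sketch. The coefficients and the standard error of estimate (18.5) are taken from the text. The function name and the example scores are hypothetical and are not data from the study. Because the standard error of estimate is large relative to the mean of about 21 interviews, any single prediction is very rough.

```python
SIGMA_EST = 18.5  # standard error of estimate reported for the equation


def predicted_interviews(d_raw, pt_raw, k_raw):
    """Predicted number of interviews from MMPI raw scores (with K corrections),
    using the equation reported for female clients."""
    return 1.377 * d_raw + 2.189 * pt_raw - 3.767 * k_raw - 30.559


# Hypothetical raw scores, for illustration only.
estimate = predicted_interviews(d_raw=25, pt_raw=30, k_raw=12)
print(f"predicted interviews: {estimate:.1f} (standard error of estimate {SIGMA_EST})")
```

The signs of the coefficients match the interpretation in the text: higher D and Pt raise the predicted number of interviews, and higher K lowers it.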

For males, there were no significant r’s. That 
MMPI scales did not predict persistence for 
males, whereas they did for females, suggested 
that there were different factors between sexes 
related to duration in therapy when interviews 
average around 20. Also there was a preponderance 
of positive though insignificant r’s for both 
sexes for clinical scales with persistence, sug- 
gesting that the more generally disturbed Ss 
remained longer in psychotherapy. 

This report would have been more complete 
had administration of the MMPI been a uniform 
intake process for all clients, thus providing 
data on single-interview clients. It may be that 
these are the chronically ambivalent shoppers 
for psychological aid who elicit concern among 
psychotherapists, but objectively it is unknown 
how they compare with persisting clients. Clinical 
theory about psychotherapeutic persistence, as 
herein delimited, has been verified for a private- 
practice setting, and the multiple-regression 
equation for females holds promise for further 
research. 


(Received August 29, 1966) 
CHANGES IN PATIENT INCOME CONCOMITANT 
WITH PSYCHOTHERAPY 1 


BERNARD F. RIESS 
Postgraduate Center for Mental Health, New York City 


It has repeatedly been observed by individual 
therapists, as reported in casual conversations, 
that patients increase their earnings from work 
during the course of treatment. The amount of 
change seems to be greater than would be 
expected of a person not in treatment but working 
in the same occupational area. An initial 
search of the literature reveals, however, no 
published data on enough cases to warrant the 
drawing of any conclusions. We present here the 
first results of a survey of income changes con- 
comitant with therapy in a relatively large 
population of patients in a big metropolitan analytic 
psychotherapeutic clinic. The actual population 
for which data are presented here numbers 
414. 


METHOD 


Each therapist was asked to get information from 
his full-time working patients (exclusive of civil 
service employees) on (a) weekly income prior to 
starting present treatment, (b) present weekly income, 
and (c) when the change in income status, 
if any, took place. For each of these patients we 
obtained additional data on age, sex, number of 
treatment sessions, and type of employment. One 
control involved the selection of a group of 145 
patients who began treatment at the time of the 
completion of the present study, matched with the 
study group for occupational category. The wages 
of this group were then compared to those of the 
study patients. 


1 An extended report of this study may be obtained 
without charge from Bernard F. Riess, Director 
of Research, Postgraduate Center for Mental 
Health, 124 East 28th Street, New York City, N. Y. 
10016, or for a fee from the American Documentation 
Institute. Order Document No. 9484 from 
ADI Auxiliary Publications Project, Photoduplication 
Service, Library of Congress, Washington, D. C. 
20540. Remit in advance $1.25 for microfilm or 
$1.25 for photocopies, and make checks payable to: 
Chief, Photoduplication Service, Library of Congress. 
This study was supported by a grant from the New 
York State Department of Mental Hygiene, Division 
of Community Services. 


RESULTS 


1. The mean income before treatment of pa- 
tients now in treatment was $83 per week. After 
a mean of 57 therapeutic sessions, the weekly 
salary of the 414 subjects was $112 per week. 

2. The mean gain of $29 per week considered 
as a concomitant of therapy can be somewhat 
validated by comparing the present salaries of 
patients currently in treatment at the Center 
with those of an occupationally similar group of 
newly started therapeutic subjects. The com- 
parison shows that the in-therapy group is today 
economically better off by $22 per week of in- 
come than people in similar occupations who are 
just beginning treatment. On the other hand, 
these two groups are not statistically different 
when their initial levels of salary are compared. 

Another attempt was made to determine the 
independence of the gain of $29 from changes 
in the economic status of the occupations in New 
York City. The Department of Labor statistics 
show changes in income of people in different 
occupations for the New York City area for 
each year. These figures for the 2-year period 
during which the majority of the 414 patients 
were in treatment showed an expected weekly 
income gain of $6 for occupations listed by the 
department. 

3. There is a tendency for the over-all increase 
for males to exceed that for females, but the 
difference is not statistically significant. Again 
inspection indicates that more male patients 
changed by very large amounts, while more 
female patients had changes at the levels of $11 
to $50 per week. 
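
The income figures reported in the Results can be checked with simple arithmetic. The sketch below uses only the dollar amounts stated above. Subtracting the Department of Labor figure from the raw gain is an illustrative assumption about how the two numbers might be combined, not a calculation reported by the author, and all variable names are hypothetical.

```python
# Dollar figures are taken from the Results; names are illustrative only.
income_before = 83       # mean weekly income before treatment ($)
income_after = 112       # mean weekly income after a mean of 57 sessions ($)
labor_market_gain = 6    # expected 2-year weekly gain from Department of Labor data ($)

# Raw gain reported in Result 2.
raw_gain = income_after - income_before

# Assumption: a simple additive adjustment for general wage growth.
gain_net_of_market = raw_gain - labor_market_gain

# Raw gain expressed as a percentage of pre-treatment income.
percent_gain = 100 * raw_gain / income_before

print(raw_gain)                 # 29
print(gain_net_of_market)       # 23
print(round(percent_gain, 1))   # 34.9
```

Under this assumption the adjusted gain of $23 is close to the $22 advantage over the occupationally matched comparison group.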


(Received November 17, 1966) 


REPLICATION AND CRITIQUE OF THREE STUDIES ON PERSONALITY 
CORRELATES OF DREAM RECALL 1


BILL DOMHOFF 


University of California, Santa Cruz 


This research attempted to increase the impor- 
tance of findings on ego strength (ES), anxiety 
(A), and repression (R) in three studies of per- 
sonality correlates of everyday dream recall 
(Schonbar, 1959; Singer & Schonbar, 1961; Tart, 
1962). While the findings of these studies were 
consistent despite small Ns and different ages, 
sexes, and methods, the correlations obtained were 
low (.21-.26) except where extreme groups were 
compared. The present study employed a larger 
N and a wider recall scale to ascertain whether 
the variables ES, A, and R might prove to be 
more predictive. In addition, the study sought 
to replicate the most significant correlation un- 
covered in such research, a .68 r between dream 
recall and need for achievement as measured 
by a rating of an original story and a daydream 
(Singer & Schonbar, 1961). 

The Ss were 84 males and 104 females be- 
tween the ages of 18 and 22 in required intro- 
ductory psychology classes at California State 
College at Los Angeles. The Ss responded to a 
10-point dream-recall questionnaire and then 
took the Barron Es, the Welsh A and R, and 
the Jackson Need-Achievement (N-A). Jackson’s 
(1965) 20-item forced-choice scale has a reli- 
ability of .91 and an r of .52 with judges’ ratings 
of N-A in a behavioral situation. 

The results obtained corroborate Singer- 
Schonbar and Tart on the magnitude of the 
relationship of ES (-.24 males, -.18 females), 
A (.18 males and females), and R (-.24 males, 
-.18 females) to recall. This agreement makes 
it probable that higher r’s reported by Schonbar, 


1 An extended report of this study may be ob- 
tained without charge from Bill Domhoff, University 
of California, Santa Cruz, or for a fee from the 
American Documentation Institute. Order Document 
No. 9488 from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress, 
Washington, D. C. 20540. Remit in advance $1.25 
for microfilm or $1.25 for photocopies, and make 
checks payable to: Chief, Photoduplication Service, 
Library of Congress. 


AND 


ALLAN GERSON 
University of Nevada 


using Cattell measures for ego strength and 
anxiety, are due to exclusion of 14 medium- 
recall Ss and computation of a point-biserial on 
the basis of extreme groups. Finally, a non- 
significant r of .05 with the N-A measure does 
not support Singer-Schonbar, but it does agree 
with a similar negative finding by Schonbar using 
the N-A scale of the Edwards Personal Prefer- 
ence Schedule. 

Disregarding the high r’s of Schonbar for sta- 
tistical reasons, it seems that the r’s between 
dream recall and ES, A, and R encompass very 
little of the variance along the recall dimension. 
If one takes the two variables that correlate 
highest with the criterion and lowest with each 
other, ES and R, and then corrects for attenua- 
tion, this multiple r accounts for only 14% of 
the variance. The multiple r for all three vari- 
ables (uncorrected for attenuation) is a mere .28. 
This result suggests that other variables are con- 
tributing the bulk of the differences between 
recallers and nonrecallers. 
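
The step from a multiple r to "variance accounted for" is just squaring. The sketch below shows that arithmetic for the reported multiple r of .28, together with the standard two-predictor formula for a multiple correlation and Spearman's correction for attenuation. The function names, the predictor intercorrelation, and the reliability values in the usage lines are hypothetical, since the report does not give them.

```python
import math

def multiple_r_two_predictors(r1, r2, r12):
    """Multiple R of a criterion on two predictors.

    r1, r2: correlations of each predictor with the criterion.
    r12: correlation between the two predictors.
    """
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return math.sqrt(r_squared)

def disattenuate(r, reliability_x, reliability_y):
    """Spearman's correction for attenuation of a single correlation."""
    return r / math.sqrt(reliability_x * reliability_y)

# Reported in the text: multiple r for all three variables, uncorrected.
reported_multiple_r = 0.28
variance_explained = reported_multiple_r ** 2
print(round(variance_explained, 3))   # 0.078, i.e. about 8% of the variance

# Hypothetical illustration: two predictors each correlating -.24 with
# recall and .40 with each other (the .40 is an assumed value).
example_r = multiple_r_two_predictors(-0.24, -0.24, 0.40)

# Hypothetical illustration: correcting r = .24 for a predictor
# reliability of .64 and a perfectly reliable criterion.
example_corrected = disattenuate(0.24, 0.64, 1.0)
```

Because the squared value is so small, even a sizeable correction for unreliability leaves most of the recall variance unexplained, which is the point the paragraph makes.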

“Noticing” or “valuing” dreams may be two 
of the most important of these variables, because 
nonrecallers often become recallers when studied 
in a sleep-dream laboratory or when asked to 
keep a daily dream log. Moving such easily 
motivated nonrecallers to the recall category 
might increase the predictive power of person- 
ality correlates. 


REFERENCES 


Jackson, D. N. The development and evaluation 
of the Personality Research Form. Unpublished 
manuscript, University of Western Ontario, 1965. 

Schonbar, R. Some manifest characteristics of re- 
callers and non-recallers. Journal of Consulting 
Psychology, 1959, 23, 414-418. 

Singer, J., & Schonbar, R. Correlates of daydream- 
ing. Journal of Consulting Psychology, 1961, 25, 
1-6. 

Tart, C. Frequency of dream recall and some per- 
sonality measures. Journal of Consulting Psychol- 
ogy, 1962, 26, 467-470. 


(Received November 4, 1966) 


Journal of Consulting Psychology 
1967, Vol. 31, No. 4, 432 


EGO STRENGTH AND TYPES OF DEFENSIVE AND 
COPING BEHAVIOR 1 


CLORINDA G. HUNTER AND LEONARD D. GOODSTEIN 


University of Cincinnati 


The purpose of this study was to investigate 
the relationship of ego strength, as determined 
by the Barron Ego Strength (Es) scale (Barron, 
1953), to the defense mechanisms of rationaliza- 
tion and denial and the coping mechanism of 
logical analysis (Kroeber, 1963). It was hypothe- 
sized that: (a) low-Es Ss would use more denial 
responses than would high-Es Ss; (b) high-Es Ss 
would use more rationalization responses than 
low-Es Ss; (c) high-Es Ss would use more 
logical analysis (coping) responses than would 
low-Es Ss; (d) low-Es Ss would be rated by 
judges as more defensive; (e) judges would cor- 
rectly identify high- and low-Es Ss by examining 
their verbatim responses. 


METHOD 


Classified as either high or low ego strength, 40 
college Ss were individually administered a difficult 
symbolic reasoning test, ostensibly to establish local 
norms. They were asked to score their own papers 
and indicate where they stood on faked norms. In 
a taped, standardized interview, they were then 
asked to explain their “poor performance” on the 
test. The interviews were transcribed, and three 
clinical judges independently rated each transcript 
for the frequency and identification of four types 
of responses: denial, rationalization, logical analysis, 
and other. In addition, the judges each rated the 
Ss’ defensiveness and ego strength on a five-point 
rating scale. All ratings were adequately reliable. 


RESULTS 


Although the low-Es Ss did use more denial 
responses than the high-Es Ss, the difference 


1 An extended report of this study may be ob- 
tained without charge from Leonard D. Goodstein, 
Department of Psychology, University of Cincinnati, 
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from ADI Auxiliary Publications Project, Photo- 
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microfilm or $1.25 for photocopies, and make 
checks payable to: Chief, Photoduplication Service, 


was not statistically significant, and Hypothesis 1 
was not confirmed. Hypotheses 3, 4, and 5 were 
supported in that high-Es Ss used a significantly 
greater number of coping responses than low-Es 
Ss, and the judges rated low-Es Ss as signifi- 
cantly more defensive than high-Es Ss. Judges 
were able to significantly and correctly identify 
the Ss in the two groups, and there was a highly 
significant relationship between Es scores and 
judges’ ratings. 

Hypothesis 2 was statistically significant in 
the opposite direction, with low-Es Ss using more 
rationalization responses than high-Es Ss. This 
hypothesis was initially suggested by the pre- 
vious findings of King and Schiller (1960) where 
high-Es Ss used a significantly greater number 
of rationalization responses than the low-Es Ss. 
One possible explanation for the disparate results 
between the two studies might be that the two 
samples came from different ends of an ego 
strength continuum. King and Schiller used male 
traffic violators for their sample, and their Ss’ 
overall level of ego strength may well have been 
considerably lower than that of the college Ss used in 
the present study. 

In general, the results show sufficient differ- 
ences in the kinds of mechanisms used and the 
ratings made by the judges to support the 
position that high- and low-Es Ss differ in their 
methods of handling a stressful situation. 


EXTROVERSION-INTROVERSION AND THE EFFECTS OF STRESS 
ON THE DRAW-A-PERSON TEST 1


HOWARD A. JACOBSON 
Jefferson Medical College Hospital 


This study is a replication and extension of 
Handler and Reyher’s (1964) study concerning 
the effects of stress on the Draw-A-Person Test 
(DAP) and the relationship between DAP 
anxiety and extroversion-introversion. 

Forty introverts (I’s) and 40 extroverts (E’s), 
female introductory psychology students scoring 
below the 30th and above the 70th percentile 
on the Maudsley Personality Inventory, drew 
a man, woman, and an auto control figure under 
stress (S) and nonstress (NS) conditions. Each 
subject acted as her own control in a counter- 
balanced design, with a 1-week interval between 
testing sessions. Stress was produced by connect- 
ing the subject to a complicated apparatus, in- 
cluding a panel of blinking lights and an EEG. 
The drawings were scored for degree of anxiety. 
Interrater reliabilities ranged from .74 to 1.0 
(Mdn = .86). 

Significant F values (male, 55.6; female, 23.6; 
auto, 25.1; p < .01) were obtained for the S-NS 
variable, indicating that external stress in- 
creases manifestations of anxiety on the draw- 
ings. Consequently, caution should be employed in the 
interpretation of anxiety in the DAP, since it 
might merely represent the influence of a 
stressful testing session. 

A two-tailed sign test was used to examine the 
effects of stress for each index. For both the 
man and woman drawings 14 of 24 indexes 
significantly differentiated S from NS drawings. 


1 Based on the master’s thesis of the senior author, 
University of Tennessee. The authors would like to 
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tained without charge from Howard Jacobson, 
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phia, Pennsylvania, or for a fee from the American 


For both drawings the “S first” order of presen- 
tation resulted in twice as many significant 
indexes, compared with the “S second” condition. 
Of 19 indexes scored for the auto, 9 differenti- 
ated S from NS drawings. There was more 
shading, hair shading, erasure, reinforcement, and 
emphasis line in the NS drawings. The remaining 
significant indexes differentiated S from NS draw- 
ings in the “expected” direction (present more 
frequently in the S situation). These results are 
consistent with Handler and Reyher’s (1964) 
findings. The S-NS significance levels for the 
three drawings were compared using the Fried- 
man two-way analysis of variance test. Only the 
value obtained for the S to NS condition was 
significant (χr² = 8.88, p < .05). The auto had 
the fewest significant indexes (3 of 19) for the 
S to NS condition; the man, 14 of 24; and the 
woman, 10 of 24. Thus, the auto drawing reflects 
less intrapsychic stress than the figures, but 
only in the S to NS condition. These order dif- 
ferences indicate that females adapt to stress 
situations once the task is known. Males, how- 
ever, appear to be more labile and react to 
stressful situations whenever they are introduced 
(Handler & Reyher, 1964). 
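
The two-tailed sign test applied to each drawing index can be sketched as an exact binomial calculation. This is a minimal illustration that assumes tied pairs are dropped before counting. The function name and the counts in the usage line are hypothetical and are not data from the study.

```python
from math import comb

def two_tailed_sign_test(n_positive, n_negative):
    """Exact two-tailed sign test p value under H0: P(positive) = 0.5.

    n_positive: subjects whose index score was higher under stress.
    n_negative: subjects whose index score was higher under nonstress.
    Tied pairs are assumed to have been discarded already.
    """
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # Probability of a split at least as extreme in one direction.
    one_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * one_tail)

# Hypothetical counts: 30 of 40 untied subjects scored higher under stress.
p_value = two_tailed_sign_test(30, 10)
print(p_value < 0.05)
```

With 80 subjects serving as their own controls, an index needs only a moderately lopsided split between conditions to reach significance, so the number of significant indexes is sensitive to the order of presentation noted above.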

The F values for the I-E differences were not 
significant. An analysis of individual indexes using 
the sign test indicated that there were no con- 
sistent reactions to stress, except that E’s tended 
to draw larger figures and I’s drew smaller 
figures. These findings contradict those of Wal- 
lach and Gahm (1960), who found that highly 
anxious social I’s were more expansive graphi- 
cally than nonanxious social I’s, while social E’s 
high in anxiety were more constricted than non- 
anxious social E’s. 


FURTHER EVIDENCE ON THE STABILITY OF THE FACTOR 
STRUCTURE OF THE TEST ANXIETY SCALE FOR CHILDREN 1 


SHEILA FELD AND JUDITH LEWIS 
National Institute of Mental Health 


The Sarason Test Anxiety Scale for Children 
(TASC) has been scored as a unidimensional 
scale, although it was designed to tap several 
aspects of Freud’s anxiety concept. Dunn’s 
(1965) factor analyses showed that the TASC 
is multidimensional for Ss in Grades 4-9. The 
present factor analytic study of second graders 
compares number and content of dimensions of 
the TASC for: (a) both sexes at this grade level 
and (b) our Ss and Dunn’s two older groups 
(Grades 4 and 5; 7 and 9). 

The Ss were 3,867 boys (B) and 3,684 girls 
(G). The TASC was expanded by two items 
without anxiety content to determine S’s aware- 
ness of dreams or thoughts about school when 
at home. Questions were administered orally in 
classrooms; the Ss circled a reply of “Yes” or 
“No.” For each sex, product-moment correla- 
tions were computed. Principal-component factor 
analyses were performed with squared multiple 
correlations to estimate h² and then were rotated 
to normalized Varimax solutions. 

Each rotation yielded four interpretable fac- 
tors. By inspection, the same factor labels were 
assigned for both sexes. Coefficients of factor 
similarity were computed across sex. Commonly 
labeled factors had values of .98 or .99. Test 
Anxiety accounted for the greatest common vari- 
ance (B = 40%; G = 39%). The 10 highest 
loadings included 8 of the 12 items that men- 
tioned the word “test.” Remote School Concern 
was the smallest factor (B = 18%; G = 14%). 
The 11 highest loadings included all 5 items 
describing dreams and 4 of 6 items dealing with 
thoughts about school when at home. The term 


1An extended report of this study may be ob- 
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Health Study Center, NIMH, 2340 University 


“concern,” rather than anxiety, was chosen be- 
cause of the high loadings of the two affectively 
neutral items reporting dreams or thoughts about 
school when at home. Poor Self-Evaluation ac- 
counted for similar common variance for B 
(21%) and G (20%). Items with high loadings 
concerned expectations of failure, especially in 
comparisons with other children. Somatic Signs 
of Anxiety accounted for more common variance 
for G (26%) than B (20%). All 5 items with 
somatic referents had highest loadings, followed 
by items about expectations of poor work. 
Coefficients of factor similarity for all paired 
factors from second graders and Dunn’s two 
older groups also showed marked similarity in 
the factor structures across grade levels. Two 
main conclusions were drawn about the multi- 
dimensional structure of the TASC as assessed 
by factor analyses during primary, upper ele- 
mentary, and junior high school grades: There 
is similarity across (a) sexes and (b) the above 
age groups. Factors were interpreted as reflect- 
ing: (a) a subclass of school-evaluation situa- 
tions—formal tests—and (b) several modes of 
anxiety responses to various school evaluations. 
This interpretation of the multidimensionality 
leaves two open issues. Are these modes of anx- 
iety responses specific to school-evaluation situa- 
tions? Are the situations included in the TASC 
a distinctive class of anxiety-eliciting stimuli? 
Research to clarify the relationship of stimulus 
and response components of anxiety should 
focus on including varied anxiety-arousing stim- 
uli in factor analyses, for example, peer accept- 
ance and aggression. Ongoing extensions of the 
current analyses focus on whether the TASC sub- 
scales show different relationships to socioeco- 
nomic, behavioral, and standardized test data. 
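
The abstract reports coefficients of factor similarity of .98 or .99 but does not say which coefficient was used. One common choice is Tucker's congruence coefficient, which is the cosine between two columns of factor loadings. The sketch below assumes that definition; the function name and the loading values are hypothetical.

```python
import math

def congruence_coefficient(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two factor-loading vectors.

    Both vectors must list loadings for the same items in the same order.
    """
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must have the same length")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm_a = math.sqrt(sum(a * a for a in loadings_a))
    norm_b = math.sqrt(sum(b * b for b in loadings_b))
    return cross / (norm_a * norm_b)

# Hypothetical loadings of five items on one factor, for boys and for girls.
boys = [0.62, 0.55, 0.48, 0.10, 0.05]
girls = [0.60, 0.58, 0.45, 0.12, 0.02]
similarity = congruence_coefficient(boys, girls)
print(similarity > 0.95)
```

The coefficient depends on the pattern of loadings rather than on the share of variance a factor explains, so two factors can be nearly identical in content while accounting for different percentages of common variance, as with Somatic Signs of Anxiety for boys and girls.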


SHOULD SOME PEOPLE BE LABELED MENTALLY ILL? 


ALBERT ELLIS 
Institute for Rational Living, New York 


The question considered is whether it is proper to label some people mentally 
ill in view of the social discriminations, self-denigration, interference with 
treatment, impeding of social progress, and unscientific close-mindedness which 
may ensue when this kind of labeling is employed. It is shown that it is not 
the labeling process itself which is necessarily harmful, but that if such terms 
as “mental illness” are operationally defined and if the individuals so described 
are not negatively evaluated as persons, it may be possible to employ these 
terms scientifically and usefully. 


For the last two decades there has been 
increasing objection by a number of psycholo- 
gists and sociologists (as well as an even 
greater number of nonprofessional writers) to 
labeling certain people as “mentally ill” or 
“emotionally sick.” Thus, Szasz (1961, 1966) 
has vigorously alleged that the concept of 
mental illness “now functions merely as a 
convenient myth.” Mowrer (1960) has con- 
tended that behavior disorders are manifes- 
tations of personal irresponsibility and sin 
rather than of disease. Whitaker and Malone 
(1953), as well as many other experiential 
and existential psychotherapists, have held 
that emotional disturbance is a rather mean- 
ingless term because practically all therapists 
are just about as sick as their patients. Ken- 
iston (1966) and a number of sociological 
writers have insisted that individual psycho- 
dynamics are not nearly as important as has 
commonly been assumed in the creation of 
human alienation and insecurity, but that our 
technological society itself lays the ground- 
work for the growing estrangement of young 
people and, to one degree or another, makes 
us all emotionally aberrant. 

The question of whether some individuals 
are especially “mentally ill” and should be 
clearly labeled so is of profound importance, 
since it affects decision making in the areas 
of hospitalization, imprisonment, psychother- 
apy in the community, vocational training and 
placement, educational advancement, and 
many other aspects of modern life. Siegel 
(1966) has recently reported that high school 
students who are hospitalized for emotional 
disturbance, or who undertake psychotherapy 
without hospitalization, are frequently held to 
be poor risks for higher education and are 
consequently refused admittance to college. 
Obviously, labeling a person “mentally ill” 
has more than theoretical import. 

To my knowledge, no dispassionate discus- 
sion of both sides of this question has yet 
been published. I shall, therefore, try to list 
the main disadvantages and advantages of 
labeling certain people “mentally ill,” so that 
psychologists in general and psychotherapists 
in particular may be better able to see and 
cope with this problem. The main issues that 
have recently been raised in connection with 
diagnosing individuals as “emotionally sick” 
involve (a) social discrimination against the 
“mentally ill,” (b) self-denigration by dis- 
turbed people, (c) moral responsibility and 
“mental illness,” (d) prophylaxis and treat- 
ment of aberrant individuals, (e) social prog- 
ress and emotional disturbance, and (f) sci- 
entific attitude and advancement in regard to 
labeling people “mentally ill.” 


Social Discrimination against the “Mentally Ill” 

There are several discriminatory practices 
which seem to be inevitably connected with 
labeling an individual as neurotic, psychotic, 
or emotionally disturbed. When so diagnosed, 
either officially or semiofficially, he is often 
discriminated against in some practical ways 
—is refused jobs, kept out of schools, re- 
jected as a love or marriage partner, etc. This 
discrimination is entirely unjust in many 
cases, since the sick individual is not given a 
chance to prove that he can succeed voca- 
tionally, educationally, or otherwise. In some 
instances, a person who behaves unconven- 
tionally or idiosyncratically may be adjudged 
psychotic and may be forcibly hospitalized. 
Consequently, his (and everyone else’s) free- 
dom of speech may be restricted by his in- 
carceration or threat thereof. Siebert (1967) 
has noted in this connection: 

The thing that has pained me for so long is that, 
while Americans will go to extreme lengths to pro- 
tect a person’s right to speak, there is really very 
little freedom in this country to express all of one’s 
thoughts. I talked to many, many people in mental 
hospitals who were placed there because they re- 
vealed some personal thoughts to a relative or to a 
psychiatrist. Few citizens realize how easy it is to 
lock up a person who has “undesirable” thoughts 
[p. 11]. 

Practically all psychological labels today 
are inexact. What is more, they keep chang- 
ing from diagnostician to diagnostician and 
from decade to decade. Thus, most of the 
patients whom Freud called neurotic would 
today be designated as borderline psychotic 
or schizophrenic reaction. Yet, once a person 
is psychiatrically labeled, he is treated as if 
that label were indubitably correct and as if 
it accurately describes his behavior. His re- 
maining inside or outside of a mental institu- 
tion, being employed or unemployed, or re- 
maining married or unmarried may depend on 
the particular kind of labeling done by a given 
psychologist or psychiatrist who is in a certain 
mood at a special time and place. 

Labeling some people as emotionally dis- 
turbed tends to set up a caste system, with 
consequent social discriminations. In most 
communities of our society, so-called healthy 
individuals are socially favored over the “men- 
tally sick.” But in some groups—Bohemian, 
hippie, criminal, or drug-taking groups—the 
reverse may be true, and the sick individual 
may be considered “in” and may be favored 
over the “square.” 

As an escapee from a New York mental 
hospital points out (Anonymous, 1966), in- 
dividuals who commit clearly illegal acts, such 
as trespassing on others’ property and refus- 
ing to support their wives, may be discrimi- 
nated against once they are judged to be 
“mentally ill” by not being held morally re- 
sponsible for their acts and not being given a 
stipulated prison term for committing these 
acts, but, instead, being indefinitely commit- 
ted to a mental institution. These individuals 
are thus deprived of their moral (or immoral) 
choices and of being held accountable for such 
choices. 

Our psychiatric terminology itself, as Da- 
vidson (1958) and Menninger (1965) indi- 
cate, is highly pejorative. Referring to people 
with behavior problems by such designations 
as “anal character,” “sadistic,” “castrating,” 
“infantile,” “psychopathic,” and “schizo- 
phrenic” hardly helps their states of mind 
and adds grave doubts to the attitudes of life 
insurance companies, social clubs, officer 
groups, and other organizations about their 
eligibility. Nor, as Menninger (1965) points 
out, 
is the patient, or ex-patient, the only sufferer from 
this situation. An entire family can be hurt by the 
diagnostic label attached to one of its members, 
because of the various implications such labels have 
in the minds of the various groups of people with 
whom that family comes in contact [p. 45]. 

With the very best intentions, then, psy- 
chologists and psychiatrists who are instru- 
mental in labeling individuals as “mentally 
ill” may unwittingly subject these individuals 
to a variety of social and legal discrimina- 
tions and may seriously interfere with their 
civil and their human rights. And not all 
psychiatric intentions are the very best! Red- 
lich and Freedman (1966), while favoring 
involuntary commitment of psychotics in 
many instances, admit that “Certainly, com- 
mitments in many cases are entirely rational 
acts; however, in some cases there is evidence 
that psychiatrists and other involved persons 
are motivated, in part, by counteraggression 
toward very provocative patients [p. 780].” 
So, quite apart from the contention of groups 
helping ex-mental patients (during the last 
two decades) that many Americans have been 
and still are being railroaded by their rela- 
tives into institutions when they are not truly 
disturbed, there seems to be considerable evi- 
dence that commitment procedures leave much 
to be desired and that various discriminatory 
mistakes are made in this connection. 

There is, however, another side to the story. 
Some individuals in our society, whatever we 
choose to call them, are clearly unfit to live 
unattended in the community, as even Szasz 
(1966) admits. Many of them should, per- 
haps, best be placed in regular prisons, 
though today that solution is hardly ideal. 
Others, such as those who have committed no 
crimes but are obviously on the brink of 
harming themselves and/or other people, can 
hardly be incarcerated in jail, nor can they 
even properly be given determinate sentences 
in a mental hospital. If their behavior is 
sufficiently aberrant, they may well have to 
be placed in some kind of protective custody 
for an indeterminate period, and what better 
place do we have for this kind of treatment 
than a mental institution? 

The main point here is that labeling an 
individual as “mentally ill,” and thereby be- 
ing enabled to send him for therapy either in 
a suitable institution or as an involuntary 
patient in his own community, frequently sub- 
jects him to unfair legal and social discrimina- 
tion. Nonetheless, many other people, and 
sometimes this individual himself, may be 
unfairly discriminated against if this kind of 
procedure is not in some way followed. Take, 
for example, the case of a suicidal individual. 
Morgenstern (1966) states: 
Since suicide is not only irrational—it punishes one- 
self for rage directed at others—but is also irrevo- 
cable, the psychiatrist and society have the human 
obligation to force reconsideration. All of us are at 
times tempted to do the irrational and the irrevoca- 
ble, and I would doubt that, having been stopped, 
we were ungrateful [p. 4]. 

The seriously disturbed person, in other 
words, may well be unfairly discriminating 
against himself, even to the point of irrevo- 
cably harming himself in some major ways. 
Is it not, therefore, fair under these condi- 
tions to judge him ill and forcibly restrain 
him from his self-sabotaging, even at the ex- 
pense of possibly discriminating against him 
in other ways? 

Granted that this question may have no 
utterly agreed-upon, clear-cut answer, here is 
another that warrants asking: Assuming that 
legal and social discriminations may accrue 
to the individual who is labeled “mentally ill,” 
is it not sometimes necessary to discriminate 
against him in this manner in order to prevent 
him from needlessly harming others? Mrs. 
Hyman Brett (1966), in a letter to the New 
York Times following its publication of 
Szasz’ article, “Mental Illness Is a Myth” 
(1966), puts this question in more detail: 
What about the freedom and the liberties of the 
relatives of the mentally ill person who consistently 
refuses care? At the same time that we refuse to 
tamper with the mentally ill person’s freedom are 
we not tampering with theirs? By returning the 
mentally ill member to his family we are chaining 
his relations to a life of dread, despondency, and 
frustration. When we allow the neurotic or psy- 
chotic the freedom to reject care we are allowing 
him at the same time another very special freedom: 
the freedom to drive his family over the border line 
into the realm of mental illness, too. For though his 
condition may not be a danger to society, it is a 
very grave and definite threat to the emotional sta- 
bility of the members of his family [p. 4]. 

Mrs. Brett may exaggerate here, since fam- 
ily members of a “mentally ill” individual 
may, at least to some extent, choose whether 
or not to be unduly influenced by his illness. 
Her general point, however, seems to have 
some validity. For in giving a highly disturbed 
person his full civil rights, we may easily im- 
pinge upon those of others whom he may 
incessantly annoy, frustrate, maim, and even 
kill, his behavior ranging from playing his 
radio very loudly all night to mowing down 
some of his neighbors with a machine gun. 
Just as the protection of the civil rights of 
Jews or Negroes does not extend to their 
rights to libel, injure, or slay non-Jews and 
non-Negroes, so may the civil rights of highly 
idiosyncratic individuals have to be curtailed 
when they infringe upon the similar rights of 
not-so-idiosyncratic others. 


Self-Denigration by Disturbed People 

Perhaps the most pernicious aspect of a 
person’s being labeled “mentally ill” is that 
he not only tends to be denigrated by other 
members of his social group, including even 
the professionals who diagnose him, but also 
that he almost always accepts their estima- 
tions of himself and makes them his own. 
This is exceptionally unfair and pernicious; 
even if he can unmistakably be shown 
to be disturbed, he is obviously not entirely 
responsible for being so, but has been born 
and/or reared to be sick and is not to be 
condemned for his state of being. 

It is true that an individual, unless he is 
in a state of complete breakdown, is some- 
what responsible for his acts, since he per- 
formed or caused them and usually has some 
degree of choice in doing or not doing them. 
Not every psychotic murders, and under the 
old McNaughten rule there was some justifi- 
cation for our courts holding certain dis- 
turbed people responsible for their crimes, as 
long as it could be shown that they were 
aware of what they were doing when they com- 
mitted these crimes and that they had some 
choice in their commission. There is no rea- 
son, however, why even thieves and murderers 
have to be condemned in toto or held to be 
worthless persons for their misdeeds. They 
are, like all of us, intrinsically fallible humans 
and to demand that they (or we) be infallible 
is unrealistic. They, moreover, are much dif- 
ferent from and greater than their perform- 
ances, and although we can legitimately mea- 
sure and evaluate an individual’s products, 
there is no way—as Hartman (1959, 1962) 
has shown—of accurately assessing his self. 
Finally, when we do assess a person as a 
whole for his performances, we inevitably 
make it impossible for him to have self-re- 
spect; for as soon as he does something 
wrong, which, being fallible, he soon must, we 
label him as bad and, thereby, strongly imply 
that as a bad person he has no other choice 
than to keep doing wrong acts again and 
again (Ellis, 1962). 

This is what frequently happens when we 
pejoratively label an individual “mentally ill.” 
Instead of indicating to him that some of his 
behavior is inefficient or mistaken, we insist 
that he is psychotic or sick, whereupon he 
logically concludes that he is probably un- 
able to do anything efficiently or right, gives 
in to his illness, and keeps perpetuating in- 
effectual behavior that he actually has the 
ability to change or stop. To the degree that 
he feels denigrated by the label of “mental 
illness,” he is likely to feel hopeless about 
acting in anything but a sick manner and 
likely to continue to act in a negative man- 
ner that is congruent with this label. Self- 
deprecation, as practically all psychologists 
and professionals agree, is one of the main 
causes of disturbed behavior. Labeling an indi- 
vidual as emotionally ill or schizophrenic 
often tends to exacerbate this cause. 

It must be admitted, on the other hand, 
that people in our society are predisposed to 
condemn themselves in toto when they per- 
ceive that their performances are wrong or 
ineffective and that one of the best ways to 
help them to ameliorate or stop their self- 
denigration is to show them that they are 
basically immature or sick. They then are
likely to conclude either that they are not 
truly responsible for their misdeeds or that
even though they are responsible, they are
not to be blamed or condemned. It is perhaps
a sad commentary on our society that the only 
individuals who are not consigned to ever- 
lasting Hell for their sins are little children 
and sick adults, but the fact is that we do 
largely exonerate “mentally ill” people for 
their misdeeds and forgive them their sins.
Until society’s attitudes in this respect sig- 
nificantly change, labeling a person “ill” has 
distinct advantages (as well as disadvantages) 
in minimizing his self-denigration. 


Moral Responsibility and “Mental Illness” 


Mowrer (1960) and Szasz (1961, 1966) 
have persuasively argued that if we cavalierly 
and indiscriminately label an individual “mentally
ill,” we are thereby glossing over the
fact that he is still responsible for a good 
deal of his behavior, that it is quite possible 
for him to change his performances for the 
better, and that (in Mowrer’s terms) he is 
not likely to improve his condition until he 
fully acknowledges his sins and actively sets 
about making reparations and correcting 
them. By focusing on the illness of certain 
individuals, these writers would contend, we 
give them rationalizations for being the way 
they are and fail to teach them how to mod- 
ify their self-destructive and immoral deeds. 

Ellis (1962), Glasser (1965), Morgenstern 
(1966), and various other psychotherapists 
have recently emphasized the point that peo- 
ple are personally responsible for the social 
consequences of their behavior and that unless 
they admit that they can largely control their 
own destinies, in spite of the strong parental 
and societal conditioning factors that existed 
during their childhood, they are not likely to 
change their ineffectual behavior. As Morgen- 
stern (1966) points out, labeling a person as 
“mentally ill” and involuntarily committing 
him to a mental institution frequently “rein- 
forces the immature wish to avoid this re- 
sponsibility, by blaming the illness for failure 
to achieve desired goals [p. 4].” 

As usual, however, there is another side to 
the story. Ausubel (1961) heartily concurs 
with Mowrer that “personality disorders . . .
can be most fruitfully conceptualized as
products of moral conflict, confusion, and
aberration [p. 70],” but he seriously questions
the notion that these disorders are basically
a reflection of sin; he demonstrates
that most immoral behavior is committed by 
individuals who would never be designated as 
ill or disturbed and that many people who dis- 
play disordered behavior are not particularly 
sinful or guilty. Moreover, Ausubel points out 
that not all “mentally sick” persons are truly 
responsible for their behavior: 

It is just as unreasonable to hold an individual 
responsible for symptoms of behavior disorder as 
to deem him accountable for symptoms of physical 
illness. He is no more culpable for his inability to 
cope with socio-psychological stress than he would 
be for his inability to resist the spread of infectious 
organisms. In those instances where warranted guilt 
feelings do contribute to personality disorder, the 
patient is accountable for the misdeeds underlying 
his guilt, but is hardly responsible for the symptoms 
brought on by the guilt feelings or for unlawful 
acts committed during his illness. . . . Lastly, even 
if it were true that all personality disorder is a 
reflection of sin and that people are accountable for 
their behavioral symptoms, it would still be unnecessary
to deny that these symptoms are manifestations
of disease. Illness is no less real because
the victim happens to be culpable for his illness. A 
glutton with hypertensive heart disease undoubtedly 
aggravates his condition by overeating and is culpa- 
ble in part for the often fatal symptoms of his 
disease, but what reasonable person would claim that 
for this reason he is not really ill [pp. 71-72]?


Prophylaxis and Treatment of Aberrant 
Individuals 


In several important ways labeling an indi- 
vidual as “mentally ill” may interfere with 
the treatment of any behavior problem he 
may display and may hinder the prevention 
of emotional disorder. For example: 

1. Calling a person “mentally sick” fre- 
quently enhances his feelings of shame about 
his “illness,” so that he defensively refuses to 
admit that he has serious behavior problems 
and therefore does not seek help with these 
problems.

2. A person who is set apart as being emotionally
aberrant may become so resentful of
this kind of segregation that he may refuse to
acknowledge his “persecutors’ ” efforts to help
him and may get into hostile encounters with
them and others that only serve to increase
his living handicaps.

3. In many instances, the “mentally ill”
individual is forcibly incarcerated in an institution
where he is kept from doing many
things he enjoys and where his condition may
become aggravated rather than improved. 

4. Labeling a person as psychotic may eas- 
ily imply, to himself and those who may be 
able to help him, that he is hopeless and that 
little can be done to get him to change his 
behavior. As Menninger (1965) indicates, 
psychological treatment today is carried out 
by many people in addition to psychologists 
and psychiatrists, and the cooperation of 
family members is often urgently needed. 
“Schizophrenia” and “mental illness” are such 
impressive labels that they induce many peo- 
ple to feel that only highly trained profes- 
sionals, if indeed anyone, can work with sick 
people and to ignore the fact that less trained
individuals can often be specifically shown 
how to help troubled humans. 

5. By being encouraged to label other peo- 
ple as sick, many of us fail to consider ade- 
quately our own problem areas. If we are not 
seen as being totally ill, we easily assume 
that we have few or no shortcomings; when 
we can easily label others as neurotic or psy- 
chotic we tend to assume that we are not in 
the least in such a class. By an all-or-none 
labeling technique, we tend to gloss over our 
own correctable deficiencies. 

6. Labeling individuals as “mentally ill” 
often bars them from various social, voca- 
tional, and educational situations where they 
would best learn how to help themselves. It 
sometimes interferes with adequate research 
into treatment, while focusing on more pre- 
cise research into diagnosing or labeling. It 
consumes psychological and psychiatric man- 
power which might better go into treatment. 

7. If people have close relatives who are 
labeled psychotic, they sometimes become so 
afraid of going insane themselves that they 
actually bring on symptoms of disturbance 
and begin to define themselves as “mentally 
ill.” 

On the other side of the ledger, if we have 
a clear-cut concept of “mental disease” and 
if we unequivocally refer to certain kinds of 
behavior as neurotic or psychotic, many bene- 
fits in preventing and treating “emotional 
disturbance” are likely to accrue. For instance:

1. If needlessly self-defeating and overly
hostile behavior does exist and is to be fought
and minimized, the individual who exhibits it
has to acknowledge (a) that it exists and (b) 
that he is to some degree responsible for its 
existence and, hence, can change it. This is 
what we really mean when we say that an 
individual is “mentally ill”—that he has 
symptoms of mental malfunctioning or illness. 
More operationally stated, he thinks, emotes, 
and acts irrationally and can usually uncon- 
demningly acknowledge and change his acts. 
If this, without any moralistic overtones, is 
the definition of “mental illness,” then it can 
distinctly help the afflicted individual to ac- 
cept himself while he is ill and to work at 
changing for the better. 

2. When an individual fully accepts the fact 
that he is emotionally disturbed, he often 
starts to improve (Redlich & Freedman, 1966). 
Why? Because (a) to some extent he knows 
why he is behaving ineffectively; (b) he can 
begin to define in more detail exactly what 
his sickness consists of and what he is doing 
to cause and maintain it; (c) he may accept 
his symptoms with more equanimity and tend 
to be less guilty about creating them; (d) he 
may be much more inclined to seek profes- 
sional help, just as he would if he were physi- 
cally ill. 

3. By accepting the concept of “mental ill- 
ness,” a person can often accept and help 
others who are neurotic or psychotic. I have 
seen many parents with highly disturbed 
children who, after learning that their child’s 
peculiar behavior is the result of a deep- 
seated disturbance which is biologically as 
well as environmentally rooted, became 
enormously less guilty and were able to sym- 
pathetically accept their child and do their 
best to help him ameliorate his symptoms. 

4. There is an essential honesty about the 
full acceptance of states of “emotional illness” 
that is itself often curative. In the last analy- 
sis, almost all neurosis and psychosis consists 
of some fundamental self-dishonesty (Glas- 
ser, 1965; London, 1964; Mowrer, 1960, 
1964) or some self-deceptive defense that one 
raises against one’s perfectionistic and gran- 
diose leanings (A. Freud, 1948; S. Freud, 
1963). When, therefore, one fully faces the 
fact that one is “mentally ill,” that this is 
not a pleasant way to be, and that one is 
partially responsible for being so, one becomes
at that very point more honest with oneself
and begins to get a little better. 

5. Accepting the fact that he is emotionally 
sick may give an individual an incentive to 
improve his lot. Most confirmed homosexuals 
in our society utterly refuse to admit that 
their homosexuality is a symptom of disturbance
(Benson, 1965; Wicker, 1966¹). They
mightily inveigh against clinicians such as 
Adler (1917), Bieber et al. (1962), and Ellis 
(1965a), who insist that they are sick. As a 
result, relatively few fixed homosexuals come
for psychotherapy, and of those who do come 
only a handful work to change their basic 
personality structure and to become hetero- 
sexually interested and capable. At the same 
time, many phobiacs admit their disturbance, 
come for therapy, and are significantly helped 
(Redlich & Freedman, 1966; Wolpe, 1958). 
This is not to say that all those who accept 
the idea of their being “mentally ill” work 
hard at becoming better. Far from it! But 
their chances are often improved, compared 
to those who insist that they are no more 
disturbed than is anyone else. 

6. Psychotherapists are often more effective 
when they face the fact that their patients 
are “mentally ill.” When they look upon 
these patients as merely having behavior 
problems, they work moderately hard with 
them and often become disillusioned at the 
poor results obtained. When they acknowl- 
edge that their patients often have basic, 
deep-seated emotional disorders, they know 
they are in for a long hard pull, work with 
greater vigor, expect many setbacks and lim- 
ited successes, and take a realistic rather than 
an over-optimistic or over-pessimistic therapeutic
view.

7. Whether we like it or not, it sometimes 
seems to be necessary for some individuals to 
be adjudged “mentally ill” and even to be 
forcibly incarcerated, if they are to be treated 
effectively. A dramatic case in point is the 
recent one of the Texas resident, Charles 
Whitman, who killed 16 innocent bystanders 
shortly after he had gone for one interview 
with a psychiatrist and failed to return for 
further treatment, although he was found to
be potentially homicidal.

¹ R. Wicker. Statement made on the Larry Glick
Show, radio station WMEX, Boston, January 8,
1966.

Redlich and Freedman (1966) remark:

As therapeutic interventions increase in intensity 
and scope, we more frequently encounter the ques- 
tion of a person impulsively leaving treatment when 
there appears to be a good chance that he could 
further improve his status and diminish his self- 
destructive behavior. Without some element of re- 
straint, such a person might not have received 
therapeutic help at all. Nonetheless, it is probably 
best, both for society and for therapy of the pa- 
tient, that coercion be restricted to the minimum 
necessary for the protection of life [p. 782]. 

Redlich and Freedman note how difficult it 
often is, as in the case of James Forrestal, 
Secretary of the Navy, who committed sui- 
cide while under psychiatric observation in a 
naval hospital, to adequately supervise per- 
sons of high position and eminence who are 
seriously disturbed. While their book was 
going through the press, Hotchner’s (1966) 
Papa Hemingway appeared. According to 
Hotchner, Hemingway, because of his literary 
genius, was treated with unusual leniency by 
psychiatrists at the Mayo Clinic, and the day 
after he returned home from the Clinic he 
shot and killed himself. There is little doubt 
in Hotchner’s mind that Hemingway might 
have lived for many more years if he had 
been honestly adjudged “mentally ill” and 
had been involuntarily treated. 

8. If the facts of “mental illness” are forthrightly
faced and it is recognized that numerous
individuals in our population are predisposed,
for biosocial reasons, to be severely
disturbed, educational prophylaxis will tend to 
be stressed. For if none of us is truly sick, 
just because all humans have some problems 
of adjustment, it seems futile to teach people 
the principles of mental hygiene, methods of 
sound thinking about themselves, and ways 
of coping with reality. But if it is accepted 
that all of us are a bit “touched” and that 
some of us are more so, greater efforts toward
prevention of “mental illness” may become
the rule. 

9. If the concept of emotional disturbance
is admitted, proper surveillance of predisposed
individuals can be instituted for preventive,
protective, and curative reasons. Thus, if a
child or adolescent is known to have tendencies
toward severe illness, he can be specifically
watched to see when these are breaking
out. He can be kept out of situations where
he may inflict damage on others, can at times
be placed in protective custody to safeguard 
himself and others, and can be regularly 
treated to minimize his sick tendencies. In
this respect, I recall a patient who was re- 
ferred to me by a psychologist almost 20 years 
ago because, although he was only moderately 
disturbed, his twin brother had just been 
institutionalized with a diagnosis of paranoid 
schizophrenia. I saw this patient steadily for 
a couple of years and since that time have 
been seeing him a few times a year. I believe 
that it is largely as a result of my treating 
him and seeing him through a number of 
incipient crises during these years that he 
has been helped to remain only moderately 
ineffective and never to be in danger of a 
serious break, although in my opinion he is 
clearly a borderline schizophrenic. Similarly, 
other incipient psychotics can, if recognized 
early enough, be helped to remain perennially 
incipient and prevented from overtly break- 
ing down. 


Social Progress and Emotional Disturbance 


If we label people who display various adjustment
problems or idiosyncratic ways of
living as “mentally ill,” we may impede social 
progress in various ways. Many of the world’s 
great statesmen, innovators, and creative 
artists have been “crackpots” who might well 
have been diagnosed as neurotic or psychotic
and whose contributions to the world could 
have been (and in some cases actually were) 
sadly curtailed because of such labels. Thus, 
Dorothea Dix, who helped reform our mental 
hospital procedures, was opposed because she 
was deemed a “screwball,” and Richard Wag- 
ner had difficulty getting some of his works 
performed because he was considered a “madman.”
In our own day, highly qualified people
may not be elected to public office because
of their unconventional and “crackpotty” 
views. Diplomats may not take with sufficient 
seriousness the statements of the Hitlers of 
the world because these leaders are seen as 
maniacs. Notable inventions may go unused 
because their inventors are considered “crazy.” 

Actually, an individual’s aberrant or pe- 
culiar characteristics may have distinct ad- 
vantages as well as disadvantages. Rank 
(1945, 1958) held that what is normally 
called neurosis is a creative process that may
lead to beneficial and exciting aesthetic productions,
and several other writers have noted
the creative aspects of some psychotic states, 
but once an idiosyncratic individual in our 
society is labeled “mentally ill,” it is assumed 
that his illness is wholly pernicious and that 
it must quickly be interrupted and abolished. 

The very concept of illness or disease, as 
applied to emotional malfunctioning, may be 
socially retrogressive, since it limits thinking 
in this area. As Albee (1966), Rieff (1966), 
and several other students of mental health 
have recently shown, the medical or disease 
model of human disorder is restrictive and 
misleading, in that it implies that the afflicted 
individual has a specific handicap caused by 
a concrete organism or event and that his 
troubles can fairly easily be diagnosed and 
cured, as is the case in many physical dis- 
orders. Actually, what has been called “men- 
tal illness” appears to have multifarious caus- 
ative factors and appears to be interrelated 
with the individual’s entire existence and his 
global philosophy of life. It is therefore best 
understood and attacked on a philosophical, 
sociological, and psychological level rather 
than a narrow medical level, and those who 
practice psychotherapy (in itself a bad word 
because of its medical origins and implica- 
tions) would aid their patients (another medi- 
cal term!) in particular and the art of mental 
healing (!!) in general if they forgot about 
the illness or disease aspects of ineffectual 
behavior and focused in a more global way 
on the causes and amelioration of such be- 
havior. 

Viewing disorganized thought, emotion, and 
action as “mental illness” may again limit 
social and psychotherapeutic progress by 
supporting the concomitant view that only 
psychiatrists and other physicians are truly 
equipped to treat the emotionally disturbed, 
when, actually, some of the best theoreticians 
and practitioners in the field have been psy- 
chologists, social workers, marriage counselors, 
clergymen, and various other kinds of non- 
medical workers. Social progress is at present 
probably being seriously hampered in the 
field of mental health by professional opposi- 
tion to nonprofessionals, such as intelligent 
housewives and college students, who have 
been found to be quite helpful with sick individuals
but who have often been kept from
doing very much in this respect because their 
patients are designated as being “mentally 
ill” (Ellis, 1966). 

As usual, much can be said in opposition to
the view that diagnosing people as “emotionally
sick” tends to hinder social and therapeutic
progress. First, there is no good evidence
to support Rank’s (1945, 1958) view
that neurosis is a creative process and that it 
should be cherished if artists and their public 
are to continue to make great progress. Nor is 
there any reason to believe that many of the 
outstanding innovators of the past and pres- 
ent would not be ignored and opposed by 
their contemporaries even if the latter could 
not call them “mentally ill” or “crazy.” 

As for the concept of “mental disease” aid- 
ing social reaction and blocking therapeutic 
progress, Menninger (1965) points out that 
modern medicine is not atomistic but holistic 
and that good physicians see disease in a 
broad, almost nonmedical (in the old sense of 
the term) way. He quotes Virchow, “Disease 
is nothing but life under altered conditions,” 
and Engel, “Disease corresponds to failures 
or disturbances in the growth, development, 
functions, and adjustments of the organism 
as a whole or of any of its systems,” (Men- 
ninger, 1965, p. 460) to show that the medical 
model of “mental illness” that Albee (1966) 
so severely criticizes is no longer typical of 
modern psychiatrists. 

Ausubel (1961, p. 70) contends that to 
label personality disorder as disease not only 
would not hinder social and therapeutic prog- 
ress but that the Szasz-Mowrer view of the 
“myth of mental illness” would “turn back 
the psychiatric clock twenty-five hundred 
years.” The most significant and perhaps the 
only real advance registered by mankind in 
evolving a rational and humane method of 
handling behavioral aberrations has been in 
substituting a concept of disease for the 
demonological and retributional doctrines regarding
their nature and etiology that flourished
until comparatively recent times. Conceptualized
as illness, the symptoms of personality
disorders can be interpreted in the
light of underlying stresses and resistances, 
both genic and environmental, and can be 
evaluated in relation to specifiable quantitative
and qualitative norms of appropriately
adaptive behavior, both cross-culturally and 
within a particular cultural context. It would 
behoove us, therefore, before we abandon the 
concept of mental illness and return to the 
medieval doctrine of unexpiated sin or adopt 
Szasz’ ambiguous criterion of difficulty in 
ethical choice and responsibility, to subject 
the foregoing proposition to careful and de- 
tailed study. 

Ausubel (1961, p. 69) also points out that 
labeling individuals with aberrant behavior 
“mentally ill” does not preclude nonmedical 
personnel from helping these individuals, since 
“an impressively large number of recognized 
diseases are legally treated today by both 
medical and non-medical specialists (e.g., 
diseases of the mouth, face, jaws, teeth, eyes, 
and feet).” Consequently, even if we maintain 
the concept of “mental illness,” we can justi- 
fiably allow and encourage all kinds of pro- 
fessionals and nonprofessionals to treat the ill. 


Scientific Advancement and the Label of 
“Mental Illness” 


There would seem to be several impedi- 
ments to the use of the scientific method and 
to the advancement of science when we label 
individuals “mentally ill.” For one thing, this 
kind of labeling leads to over-categorization 
and higher-order abstracting, which obscures 
scientific thought and leads to countless human
misunderstandings (Korzybski, 1933,
1951). To say that an individual is bad because
his behavior is poor is to fabricate a
sadly overgeneralized and invariably false 
description of him, as it is most unlikely that 
all his behavior (past, present, and future)
was, is, or will be poor. Similarly, to label a 
person as a genius is to describe loosely and
inaccurately, because it is likely that (at
most!) he displays certain aspects of genius
in only some of his productions, even if his
name is Leonardo da Vinci; it is most proba- 
ble that in many or most of the other aspects 
of his life, for example, his playing pingpong, 
making love, and cooking a soufflé, he is far 
from displaying many aspects of genius (El- 
lis, 1965b). 

This kind of overgeneralizing distorts reality
and causes the unrealistic (and often unfair)
condemnation or deification of a human
as a whole for relatively isolated parts or
aspects of his functioning. Just as an individ- 
ual’s good deeds do not prove that he, on the 
whole, is a genius, so his bizarre or dysfunc- 
tional acts fail to show that he is totally 
“mentally ill” or incompetent. Designating 
him in this manner may, therefore, lead to 
misapprehension and misunderstanding of his 
sick and healthy behavior. 

Labels of all kinds promote close-minded- 
ness rather than open-minded, experimental, 
scientific attitudes. Calling an individual
“mentally ill” tends to put him in a niche, 
from whence his removal may never be con- 
sidered. It encourages us to diagnose an indi- 
vidual’s condition and then to forget about it 
because it has been neatly categorized, to 
rigidify our thinking in the field of mental 
health itself, and to help us forget that the 
patient’s “illness” is more of a hypothesis 
than a well-established fact. 

Szasz (1961) has contended that the con- 
cept of “mental illness” is antithetical to sci- 
ence because it is demonological in nature, in 
that it follows the lines of religious myths in 
general and the belief in witchcraft in par- 
ticular and because it uses a reified abstrac- 
tion, “a deformity of personality,” to account 
causally for disordered behavior and human 
disharmony. Many other writers, such as Ellis 
(1950) and LaPiere (1960), have held that 
the Freudian terms, in which most forms of 
emotional disturbance are put today (e.g.,
“weak ego” and “punishing superego”), are 
reifications that have no actual substance 
behind them and are hence mythical and mis- 
leading entities. The entire field of “mental 
health” appears to be replete with these kinds
of myths.

Although some of these objections to the diagnosis
of “mental disease” are important (and
others seem to be trivial), there is much to be
said in favor of the notion that categorizations 
of this sort are, when carefully made, reason- 
ably accurate and quite helpful to the cause 
of scientific advancement. Arguments in this
connection include the following: 

1. Although it is inaccurate to state that 
the individual in our culture who is usually 
labeled “mentally ill” is a much different kind 
of person from the healthy individual, or that 
he exhibits entirely aberrant behavior, or that
he is a bad or lower kind of person because
he sometimes behaves oddly, the fact re- 
mains that there is almost always some sig- 
nificant difference between the actions of this 
ill individual and those of another who is 
well. What is more, the existing difference is 
one that can usually (if not always) be de- 
tected by a trained observer, is fairly con- 
sistently evident, and leads to definite behav- 
ior of a self-defeating or antisocial nature. If 
the individual with aberrant behavior is not 
in any way to be labeled “mentally ill,” neu- 
rotic, psychotic, or something similar, the 
peculiarity, undesirability, and improvability 
of his behavior is likely to be overlooked, 
some segment of reality will thereby be de- 
nied, and the essence of science—observation 
and classification—will be rejected. 

2. There is considerable and ever-increasing 
scientific evidence to show that although the 
term “mental illness” itself is vague, the 
major characteristics which are subsumed 
under its rubric, such as compulsion, over- 
suspiciousness, phobia, depression, and in- 
tense rage, do exist and have observable ide- 
ational and physiological correlates. Thus, 
feelings of depression are usually accompanied
by the individual’s belief that “When I do the 
wrong thing, I am no good and will probably 
always remain worthless,” and “If signifi- 
cant people in my life do not approve of me, 
I can’t approve of myself.” These feelings are, 
in addition, frequently accompanied by fa- 
tigue, poor appetite, insensitivity to stimula- 
tion, ineffective performance, etc. Objectively, 
therefore, some individuals can be described 
as being consistently depressed and in that 
sense, at least, may be thought of as being 
“mentally ill.” 

3. Some kind of general factor of emotional 
distress appears to exist in certain individuals, 
since they are observed to display various 
major symptoms (e.g., hostility, anxiety, and 
depression), while other individuals are prac- 
tically symptom free. Thousands of years of 
observation would seem to attest to the ex- 
istence of this general factor, as many of the 
descriptions of peculiar people in past centuries
are amazingly similar to modern clinical
descriptions. Recently, moreover, a great
deal of evidence has accumulated which tends
to show that people who display severe behavior
problems are to some degree biologically
different from others (Chess, Thomas, &
Birch, 1965; Greenfield & Lewis, 1965; Red- 
lich & Freedman, 1966) and that they can be 
reliably selected from the general population 
(Joint Commission on Mental Illness and 
Health, 1961). To ignore this evidence of 
“mental illness” would seem to be highly 
unrealistic; to acknowledge it would be to 
accept people as they truly are. 

4. Although all self-defeating human behav- 
ior may well have elements of social learning 
and may be best understood, as Szasz con- 
tends, by being studied in a sociological con- 
text and in the light of social deviance, the 
fact remains that the individual himself con- 
tributes significantly to what he accepts or 
rejects from his culture and, at times, may 
therefore be justifiably deemed sick or disordered.
Any one of us, as Messer (1966) observes,
may be neurotically influenced by
dramatic television commercials which con- 
vince us that we have acid indigestion when 
we experience abdominal discomfort. Few of 
us would conclude, however, that the discom- 
fort represents a demon tearing away the 
lining of our stomachs and that unless the 
pain stops we must cut ourselves open to get 
at this demon. Those few, who gratuitously 
add their own distorted perceptions and 
thoughts to their socially imbibed neurotic 
ideas, may justifiably be diagnosed as psy- 
chotic, even though some of their notions 
(e.g., that demons could exist) are partially 
derived from their cultures. 

5. Although we may concede Szasz’ (1961) 
points that what we usually call “mental ill- 
ness” is largely an expression of man’s strug- 
gle with the problem of how he should live 
and that human relations are inherently 
fraught with difficulties, Ausubel (1961) dem- 
onstrates that, 
there is no valid reason why a particular symptom 
cannot both reflect a problem in living and consti- 
tute a manifestation of disease. . . . Some individ- 
uals, either because of the magnitude of the stress 
involved, or because of genically or environmentally 
induced susceptibility to ordinary degrees of stress, 
respond to the problems of living with behavior that 
is either seriously distorted or sufficiently unadaptive
to prevent normal interpersonal relations and
vocational functioning. The latter outcome—gross
deviation from a designated range of desirable be- 
havior variability—conforms to the generally under- 
stood meaning of mental illness [p. 71]. 


SHOULD SOME PEOPLE BE LABELED MENTALLY ILL?


DISCUSSION 

It would appear that there are important 
disadvantages as well as advantages in label- 
ing people “mentally ill.” Many of the disad- 
vantages result from our tendency to include 
in the terms “mental illness,” “neurosis,” and 
“psychosis” not only a description of the fact 
that the afflicted individual behaves self- 
defeatingly and inappropriately to his social 
group, but also the evaluative element that he 
is bad, inferior, or worthless for so behaving. 
If this evaluative element were not gratui- 
tously added, the term “mental illness,” even 
though an abstraction that is not too precise, 
might have descriptive, diagnostic, and thera- 
peutic usefulness. It is a kind of shorthand 
term which can be used to describe the usual 
and fairly consistent state of a person who 
keeps driving himself to act ineffectually and 
bizarrely. 

Thus, instead of saying, “He is mentally 
ill,” we could say, “He is a human being who 
at the present time is behaving in a self- 
defeating and/or needlessly antisocial man- 
ner and who will most probably continue to 
do so in the future, and, although he is par- 
tially creating or causing (and in this sense is 
responsible for) his aberrant behavior, he is 
still not to be condemned for creating it but 
is to be helped to overcome it.” This second 
statement is more precise, accurate, and help- 
ful than the first one, but it is often imprac- 
tical to spell it out in this detail. It is, there- 
fore, legitimate to use the first statement, “He 
is mentally ill,” as long as we clearly under- 
stand that it means the longer version. 

A good solution, then, to the problem of 
labeling an individual “mentally ill” is to 
change the evaluative attitude which gives the 
term “mental illness” a pejorative tone and 
to educate all of us, including professionals, to 
accept “emotionally sick” human beings with- 
out condemnation, punishment, or needless 
restriction. This, to some degree, has already 
occurred, since the attitude that most of us 
take toward disturbed people today is much 
less negative than that taken by most people 
a century or more ago; much, however, re- 
mains to be accomplished in this respect. 

Meanwhile, what is to be done? For psy- 
chologists, psychiatrists, psychiatric social 
workers, and other professionals, the follow- 
ing conclusions are in order: 
1. The term “mental illness,” or some simi- 
lar label, is likely to be around for some time, 
even though continuing efforts can be made 
to change current psychological usage. 

2. An individual who is “mentally ill” may 
be more operationally defined as a person 
who, with some consistency, behaves in dys- 
functional ways in certain aspects of his life, 
but who is rarely totally “disturbed” or un- 
controlled. 

3. It is highly dangerous to evaluate a 
“mentally ill” person as you would evaluate 
his acts or performances. If he is sufficiently 
psychotic, he may not even be responsible for 
his acts. If he is less disturbed, he may be 
responsible but not justifiably condemnable 
for his deeds, since they are only a part or 
an aspect of him, and to excoriate him in 
toto for these deeds is to make an unwar- 
ranted and usually harmful overgeneraliza- 
tion about him. 

4. Although most “mentally ill” individuals 
perform bizarre and unconventional acts, not 
all people who perform such acts are sick or 
ill. Neurosis or psychosis exists not because 
of an individual’s deeds, but because of the 
overly anxious, compulsive, rigid, or unreal- 
istic manner in which he keeps performing 
them. 

5. Most “mentally ill” individuals are vari- 
able from day to day and changeable from 
one period of their lives to another. The fact 
that they act inappropriately today does not 
mean that their behavior was equally dysfunc- 
tional yesterday nor that it will be so tomor- 
row. Such people usually have considerable 
capacities for growth and can change radi- 
cally for the better (as well as for the worse). 

6. People, no matter how “mentally ill” 
they may be, are always human. We owe 
them the same kind of general respect that we 
owe to all human beings, namely, giving them 
the rights to survive, to be as happy as possi- 
ble in their handicapped conditions, to be 
helped to function as well as possible and to 
develop their potentials, and to be protected 
from needlessly harming themselves and oth- 
ers. 

If these approaches to individuals with 
severe emotional problems are kept solidly in 
the forefront of our consciousness and are 
actualized in our relationships with them, the 
question of whether to label them as “men- 
tally ill” may well become academic. 
Editor’s Note 


Because of the important issues raised by Albert Ellis regarding 
the concept of mental illness, the Editor has invited the discussion 
by Theodore Sarbin which follows. 


ON THE FUTILITY OF THE PROPOSITION THAT SOME PEOPLE 
BE LABELED “MENTALLY ILL” 


THEODORE R. SARBIN 
University of California, Berkeley 


By recognizing the metaphorical nature of “symptoms” and “illness” and the 
hypothetical nature of “mind,” the mythical character of the mental concept is 
exposed. Conclusions lead the author to take a position contrary to Ellis’s: 
Logical canons as well as humanistic value orientations direct us to delete 
“mental illness” from our vocabulary. Such a deletion does not deny that 
persons who engage in certain kinds of norm violations, which Ellis would 
call symptoms of mental illness, present problems to society. How to contain, 
manage, and reform persons judged to be actual or potential violators of 
social norms has been and continues to be one of the fundamental problems 
of social organizations. Creative solutions to such fundamental problems require 
a new set of metaphors and the sustained effort of experts in jurisprudence, 
social engineering, law enforcement, and community psychology. 


The writing of a dispassionate account of the 
current utility of the mental illness concept 
reflects a noble purpose. Ellis (1967), by juxta- 
posing pro et contra arguments, tries to imple- 
ment this purpose. On the one hand, he recog- 
nizes the massive negative utilities that result 
from the use of the mental-illness label; on the 
other hand, he points to occasions where the 
employment of the label appears to have positive 
utility. His studied conclusion is that the label, 
when used by professional diagnosticians in an 
operational way, identifies a limited number of 
people who are “really mentally ill.” He adds 
the caution that the person who uses the label 
must subtract from it the pejorative components 
that have become part and parcel of the concept. 
Of several definitions, the following is representa- 
tive of Ellis’s viewpoint: 
This is what we really mean when we say that an 
individual is “mentally ill”—that he has symptoms 
of mental malfunctioning or illness. More opera- 
tionally stated, he thinks, emotes, and acts irration- 
ally and he can usually uncondemningly acknowl- 
edge and change his acts. If this, without any moral- 
istic overtones, is the definition of “mental illness,” 
then it can distinctly help the afflicted individual to 
accept himself while he is ill . . . [p. 440; italics 
added]. 


The general conclusions drawn by Ellis must 
be rejected on logical grounds. They represent 
not so much a lack of attention to the rules of 
evidence (to be mentioned later) as the accep- 
tance of an entrenched and unwarranted belief 
that operates as a major premise. When operative, 
the premise may be stated: The label “mental 
illness” reliably denotes certain forms of conduct 
that are discriminable from forms of conduct 
that may be reliably denoted as “not mentally 
ill.” 


Since Ellis does not establish the ontological 
argument for “mental illness,” his conclusions 
are illicit. That is to say, he assumes the truth 
of the proposition he sets out to demonstrate. 
(Note the italicized phrases in the quotation 
above.) The fundamental question is by-passed; 
to wit, is there a set of observations for which 
the dual metaphor “mental illness” is appropriate? 

Most of Ellis’s (and others’, for example Au- 
subel, 1961) arguments aimed at retaining the 
mental-illness label flow from concealed, tacit, 
and disguised implications now contained within 
the label itself. Further, such arguments do not 
take into account the fact that the choice of 
label not only constrains further descriptive 
elaborations of the conduct under observation, 
but also indirectly restricts alternatives to action. 
The sentence, “a child . . . is known to have 
tendencies toward severe (mental) illness . . .” 
contains implications different from “a child has 
tendencies to hit other children.” 

To anticipate a criticism of the semiotic ap- 
proach as a legitimate entrée into the argument, 
let me assert that the choice of a metaphor to 
designate an object or event is not inconsequen- 
tial. Every metaphor contains a wealth of con- 
notations, each connotation has the potential for 
manifold implications, and each implication is a 
directive to action. While metaphors are ordi- 
narily used by people to facilitate communication, 
the peril is always at hand that people may be 
used by metaphors (Turbayne, 1960). Such a 
peril is activated when the user of a metaphor 
ignores, forgets, or purposely drops syntactical 
modifiers (e.g., as if) that denote the metaphor 
and, instead, employs the word in a literal fash- 
ion. To say “Jones is a saint” carries one set of 
implications if we supply the tacit modifier (“It 
is as if Jones is a saint”); the sentence carries 
a radically different set of implications if the 
predicate is treated as literal. The effect of 
permanently ignoring the metaphoric properties 
of a word, that is, of dropping the expressed or 
tacit modifiers, is to hypostatize an entity. Such 
hypostatization sets the stage for myth making. 

Most of Ellis’s arguments topple of their own 
structural defects, defects related to the uncriti- 
cal acceptance of “illness of the mind” as the 
proper concept for describing the conduct of 
people who violate propriety norms (the mores 
of Sumner, 1906). Much of the undiagnosed con- 
fusion currently noted in the helping professions 
and in relevant juridical decisions is reflected in 
Ellis’s paper. Such confusion might be reduced if 
we looked at the metaphorical background of our 
constraining vocabularies. First, let us look at 
“illness.” 

The basic referent for illness and for synonyms 
such as sickness and disease is a stable one, 
extending over centuries. The referent is dis- 
comfort of some kind, such as aches, pains, 
cramps, chills, paralyses, and so on. The discom- 
fort is a self-appraisal through attention to un- 
usual proximal stimuli, that is, stimuli located 
“inside” the organism. These proximal stimuli, 
when they occur simultaneously with dysfunction 
of bodily organs, are the so-called symptoms of 
illness. A diagnosis of illness or disease meant 
not only that a person reported discomforts, but 
that the associated somatic dysfunction interfered 
with the performance of some of his customary 
roles. This general paradigm of sickness or illness 
is widespread and may be found in ancient writ- 
ings and in ethnographic reports. 

A compelling question arises: How did the 
concept “illness” come to include gross behavior, 
that is, misconduct, rather than complaints and 
somatic symptoms which were the defining cri- 
teria of pre-Renaissance diagnosis? What addi- 
tional criteria were employed to increase the 
breadth of the concept “illness”? 

The inclusion of behavior disorders in the 
concept “illness” did not come about suddenly or 
accidentally. Rather, the label “illness” was at 
first used as a metaphor and later transformed 
into a myth. 

The beginning of this metaphor-to-myth trans- 
formation may be located in the 16th Century. 
The demoniacal model of conduct disorders, codi- 
fied in the 15th Century Malleus Maleficarum, 
had embraced all conduct that departed from the 
existing norms and was policed by zealous church 
and secular authorities. The most outstanding 
result of this thought model was the Inquisition, 
a social movement that among other things influ- 
enced the diagnosis and treatment of unusual 
imaginings, esoteric beliefs, and extraordinary 
conduct. The diagnosis of witchcraft and the 
prescription of treatment (burning) was the 
province of ecclesiastical specialists. 

The 16th Century witnessed the beginnings of 
a reaction against the excesses of the Inquisition. 
The beginnings of humanistic philosophy, the 
discovery and serious study of Galen and other 
classical writers, the renunciation of scholasti- 
cism—the whole thrust of the Renaissance was 
opposite that of the Inquisition. In this atmos- 
phere, Teresa of Avila, an outstanding figure of 
the Counter-Reformation, contributed to the 
shift from demons to “illness” as the cause of 
conduct disturbances. A group of nuns was ex- 
hibiting conduct which at a later date would have 
been called hysteria. By declaring these women 
to be infirm or ill, Teresa was able to fend off 
the Inquisition. However, the appeal that a diag- 
nosis should be changed from witchcraft to ill- 
ness required some cognitive elaboration. She 
invoked the notion of natural causes. Among the 
natural causes were (a) melancholy (Galenic 
humoral pathology), (b) weak imagination, and 
(c) drowsiness. If a person’s conduct could be 
accounted for by such natural causes, it was to 
be regarded not as evil, but como enfermas, as 
if sick. By employing the metaphor “as if sick,” 
she implied that practitioners of physic rather 
than clergymen should be the responsible social 
specialists (Sarbin & Juhasz, 1967). 

When employing metaphorical expressions 
there is a common human tendency to drop the 
qualifying “as if” (Turbayne, 1960). That is to 
say, the metaphor is used without a qualifier to 
designate it as figurative rather than literal. In 
the case of illness as a metaphor for conditions 
not meeting the usual criteria of illness, the 
dropping of the “as if” was facilitated by the 
practitioners of physic. It was awkward for them 
to talk about two kinds of illness, “real” illness 
and “as if” illness. When Galenic classifications 
were reintroduced, the “as if” was dropped. Thus, 
post-Renaissance physicians could concern them- 
selves with illness as traditionally understood and 
also with norm violations as illness. A review of 
the 16th and 17th Century treatises on “physic” 
reveals clearly that Galen’s humoral theory was 
the standard for diagnosis and treatment. The 
diagnostic problem was how to construct infer- 
ences about the balance of humors inside the 
organism. 

The decline of the power of church authorities 
in diagnosing extraordinary imaginings and per- 
plexing conduct was parallel to the rise of sci- 
ence. The prestige of the scientist helped in 
establishing the model of Galen for both kinds of 
“illness”—those with somatic complaints and ob- 
servable somatic symptoms and those without 
somatic complaints but with unusual behavior 
standing for somatic symptoms. 

Whereas the concept illness had been satisfied 
by the exclusive use of conjunctive criteria (com- 
plaints and observable somatic symptoms), it 
was now satisfied by the use of disjunctive cri- 
teria (complaints and somatic symptoms or com- 
plaints by others of perplexing, embarrassing, 
mystifying conduct). As a result of the uncritical 
acceptance of the humoral pathology of Galen as 
the overriding explanation for both somatic and 
behavior disorders, the latter became assimilated 
to the former. That is to say, to meet the re- 
quirements of the basic Galenic model, symp- 
toms of disease had to be observed, so the ob- 
served behavior sequences were regarded as if 
they were the symptoms. Thus, the verbal report 
of strange imaginings on the one hand and fever 
on the other, were treated as belonging to the 
same class, that is, symptoms. As a result of 
shifting from a metaphoric to a literal interpre- 
tation of gross behavior as symptom, Galenic 
medicine embraced not only everything somatic 
but also all conduct. Now, any bit of behavior— 
laughing, crying, threatening, spitting, silence, 
imagining, lying, and believing—could be called 
symptoms of underlying internal pathology. 

The basic Galenic model was not rejected by 
psychiatry or clinical psychology. Microbes, tox- 
ins, and growths, which were material and op- 
erated according to mechanical principles, were 
appropriate “causes” of diseases of the body. 
They were inside. The appropriate causes for 
abnormal behavior had to be sought on different 
dimensions. Since the mind-body conception was 
taken as truth, the hypothesis could be enter- 
tained that the causes of abnormal conduct were 
in the mind. If this were so, then the most ap- 
propriate label for such nonsomatic diseases 
would be “mental illness.” 

Before considering the meaning of “mental” 
in the phrase “mental illness,” let me recapitulate. 
“Illness,” as in mental illness, is an illicit trans- 
formation of a metaphorical concept to a literal 
one. To save unfortunate people from being 
labeled witches, it was humane to treat persons 
who exhibited misconduct of certain kinds as if 
they were ill. The Galenic model facilitated the 
eliding of the hypothetical phrase, the “as if,” 
and the concept of illness was thus deformed to 
include events that did not meet the original 
conjunctive criteria for illness. A second trans- 
formation assured the validity of the Galenic 
model. The mystifying behaviors could be treated 
as if they were symptoms equivalent to somatic 
symptoms. By dropping the “as if” modifier, ob- 
served behavior could be interpreted as sympto- 
matic of underlying internal pathology. 

How did the notion of “illness of the mind” 
become so widely accepted that it served as the 
groundwork for several professions? A searching 
historical analysis makes clear that mind was 
originally employed as a metaphor to denote such 
events as remembering and thinking. (Colloquial 
English has retained this formulation, as in 
“mind your manners.”) The shift of meaning to 
that of a substantive or agency can best be un- 
derstood as another instance of metaphor-to- 
myth transformation (Ryle, 1948). 

The modern practitioner of Galenic psychiatry 
and psychology operates from the principle that 
the “illness” about which he is concerned is in 
the mind (or psyche, or psychic apparatus). But 
the mind, even for Galenic practitioners, was too 
abstract and undifferentiated a concept. 

Since the mind was invisible and immaterial, it 
could not have the same properties as the body— 
properties that could be denoted by physicalistic 
terms. Visible palpable organs being the com- 
ponents of the material body, what differentiat- 
ing components of the invisible impalpable men- 
tal entity could one discover or invent? A new 
metaphor was required—the metaphor of states 
of mind. States of love, fear, anxiety, apathy, 
etc. were invented to account for differences in 
observed conduct. The practitioner now had the 
job of discovering through chains of inferences 
which mental states were responsible for normal 
and abnormal conduct. 


MIND AS AN ORGAN OF ILLNESS 


Three developments contributed to the con- 
struction of mind as the repository of special 
states and as an organ that was subject to “ill- 
ness”: (1) the ready availability of dispositional 
terms, (2) the introduction of new terms of faith 
and religion that located religious experience 
“inside” the person, and (3) the development of 
a scientific lexicon. 

1. Dispositional terms are shorthand expres- 
sions for combinations or orderings of distal 
and/or proximal events—in principle, a disposi- 
tional term can be reduced to a series of observ- 
able occurrences. For example, “bravery” implies 
a set of concrete behaviors under certain condi- 
tions. No implication is carried that the referent 
is an internal mental state. The development of 
dispositional terms, however, appears to be a 
necessary (though not sufficient) prerequisite for 
the postulation of mental states. In time, dispo- 
sitional terms become elided and remote from 
the original metaphorical beginnings. 
2. Dispositional terms were conveniently bor- 
rowed to denote religious conceptions which fol- 
lowed the shift from an emphasis on ritual and 
ceremony to inward, personal aspects of faith. 
Theologians and preachers gave a new set of 
referents to these dispositional terms, referents 
that changed dispositional terms from brief no- 
tations of observable conduct to states of the 
soul. The context in which mental states are 
employed is best expressed by the polarity 
inside-outside. The problem for the medieval 
thinker was to find a paradigm for locating events 
on the inside. Such a model could have been 
constructed from the following observations: 
Two classes of proximal inputs may be identi- 
fied. The first occurs in a context of distal events: 
for example, pain in the ankle occurs in a con- 
text of tripping over a curb; a burning irritation 
in the fingers occurs in the context of leaning on 
a hot radiator. The second class of proximal 
inputs occurs in the absence of associated distal 
events, such as toothache, headache, gastritis, 
neuritis, etc. Since the antecedents of the latter 
inputs could not be located in the outside world, 
the locus of the somatic perception inside the 
body was taken as the causal locus. Medieval 
man had little reliable knowledge of anatomy 
save that there were bones, sinews, tubes, and 
fluids and there were also empty spaces. Under 
the authority of the priests, he acquired the 
belief that an immaterial and invisible soul re- 
sided in these otherwise empty spaces. On this 
belief system, events for which there were no ob- 
served distal contexts could be attributed to the 
workings of this inner entity or soul. Such an 
analysis probably prepared the way for locating 
dispositions inside the person and calling them 
states of mind. If the cause of an event had no 
obvious external locus, then it must have an 
internal locus. Dispositions, when they are codi- 
fied as substantives, tend to be treated in the 
same way as other nouns, as possessing “thing- 
ness.” Thus bravery, lust, conscience, purity, 
devotion—all dispositional terms originally tied 
to orderings of behavior—are framed as nouns. 
If nouns are names of things, and things have 
location, the problem emerged: where to locate 
the referents for these nouns? The answer is 
similar to the process of locating inside the 
person the cause of pain and discomfort in the 
absence of external occurrences. Thus, anger, joy, 
courage, happiness, etc., came to be located in 
the soul. 

3. The replacement of theologians by scientists 
in the 16th and 17th Centuries in matters per- 
taining to strange and mysterious conduct made 
necessary a shift from such theological terms as 
“soul” to scientific metaphors. However, the sci- 
entists could not break completely with the en- 
trenched dualistic philosophy. They took as their 
point of departure the facts of thinking and 
knowing and, as a substitute for the soul, em- 
ployed mind as the organ for such activities. With 
the development of classical scholarship, Greek 
terms were substituted for the vernacular, the 
most popular being “psyche” (Boring, 1966). 
The efforts of the post-Renaissance Galenic 
practitioners, then, were directed toward ana- 
lyzing states of mind or psychic events. Those 
sequences of perplexing conduct that could not 
be related to external occurrences were declared 
to be outcomes of internal mental or psychic 
processes. 

Thus mental states—the objects of interest and 
study for the diagnostician of “mental illness”— 
were postulated to fill gaps in early knowledge. 
Through historical and linguistic processes, the 
construct was reified. Contemporary users of the 
mental-illness concept are guilty of illicitly shift- 
ing from metaphor to myth. Instead of main- 
taining the metaphorical rhetoric “it is as if 
there were states of mind,” and “it is as if some 
‘states of mind’ could be characterized as sick- 
ness,” the contemporary mentalist conducts much 
of his work as if he believes that minds are 
“real” entities and that, like bodies, they can be 
sick or healthy. 

The most potent implication of the metaphor 
is that persons labeled mentally ill are categorized 
as significantly discontinuous from persons labeled 
with the unmodified term “ill.” Of course, re- 
ferring to persons simply as ill or sick suggests 
that they belong to a class different from the 
mutually exclusive class “not ill” or “healthy.” 
Assigning persons to the class “ill” carries the 
meaning of objective symptoms of a recognized 
or named disease, in addition to subjectively ex- 
perienced discomfort. In most societies, persons 
so classified are temporarily excused from the 
performance of selected role obligations. The 
label carries no hint of negative valuation. Sick- 
ness, in general, is something for which one is 
not responsible. 

However, when the adjective “mental” is 
prefixed, a whole new set of implications follows. 
Contrary to the humane intent of those who re- 
sisted the Inquisitors by employing the non- 
pejorative diagnostic label of illness, present 
usage is transparently pejorative. 

In adding the word “mental” to “illness,” the 
whole meaning structure changes. In the first 
place, the necessity for adding a prefix to “ill- 
ness” imposes a special constraint on the in- 
terpreter: He asks, “What about this person or 
his behavior calls for such a special designation?” 
Since it is a special kind of illness, does the same 
expectation hold that he (the patient) is to be 
temporarily excused from the enactment of his 
roles? 

The answers to these questions may be found 
in a number of studies (Cumming & Cumming, 
1962; Goffman, 1961; Nunnally, 1961; Phillips, 
1963). Persons who are labeled mentally ill are 
not regarded as merely sick; they are regarded 
as a special class of beings, to be feared or 
scorned, sometimes to be pitied, but nearly al- 
ways to be degraded. Coincident with such neg- 
ative valuations are the beliefs that such “men- 
tally ill” persons discharge obligations only of 
the most simple kinds. The author has elsewhere 
argued that the process whereby a person is con- 
verted into a mental patient carries with it the 
potential for self-devaluation. The stigmatiza- 
tion, then, may work in the nature of a self- 
fulfilling prophecy (Sarbin, 1967c). 

Further, because of the inherent vagueness in 
the concept of mind, its assumed independence 
from the body, and its purported timelessness 
(derived from the immortal soul), there is a 
readiness to regard this special kind of sickness 
as permanent. Thus, a person with a fractured 
wrist or a patient suffering from influenza, that 
is, a sick person, may take up his customary 
roles upon being restored to health. A person 
diagnosed as mentally ill, however, is stigmatized. 
Although “cured” of the behavior that initiated 
the sequence of social and political acts that 
resulted in his being classified as mentally ill, his 
public will not usually accept such “cures” as 
permanent. It is as if the mental states were 
capable of disguising the person as healthy, al- 
though the underlying mental illness remains in 
a dormant or latent state. 

The pejorative connotation is an integral part 
of the concept. Ellis’s advice to subtract the 
“moralistic overtones” is gratuitous. One can no 
more delete by fiat the valuational component 
from “mental illness” than eliminate the “pleas- 
antness” from the act of eating a preferred food. 

Another implication of the mental-illness con- 
cept stemming from the demonstrated utility of 
germ theory for nonmental illness is the internal 
causal locus of mental illness. But the shadowy 
interior of the mind is not easily entered. The 
experts must depend on chains of inference forged 
out of the verbal and nonverbal communications 
of patients and informants. From such communi- 
cations, today’s experts draw conclusions about 
the mental structures, their dynamic properties, 
and their relation to observed behavior in the 
same manner as Galenic practitioners drew con- 
clusions about the distribution of the humors. 
One outcome of the exclusive verbal preoccupa- 
tion with psychic states is the neglect and avoid- 
ance of events in the social systems that might 
be antecedent to instances of misconduct illicitly 
and arbitrarily called symptoms. 

The heuristic implications of the mental-ill- 
ness metaphor are no less important than the 
practical implications. Scientists of many kinds 
have discovered the causes for many (nonmental) 
illnesses by looking inside the body. By adding a 
postulate that all mental states are caused by 
organic conditions (the somatopsychic hypothe- 
sis) and also accepting disordered conduct as 
symptomatic of underlying disease entities, the 
corollary follows that the ultimate causal agents 
will be discovered through searching for bio- 
chemical, toxicological, and bacteriological sub- 
strates. Again, such search methods deploy atten- 
tion and effort away from the social ecology as 
a possible source of antecedent conditions of 
misconduct. 


REJECTION OF THE MENTAL-ILLNESS CONCEPT 


The analysis offered so far supports the argu- 
ment that the label “mental illness” should be 
eliminated from our vocabulary. Following from 
the implications contained in the label, the 
logical arguments by themselves would predict 
the social discrimination and self-denigration 
consequent to the establishment of social insti- 
tutions to segregate, house, treat, manage, and 
reform norm violators. The tacit semantic rela- 
tion between sin (or evil) and mental illness 
(Crumpton, Weinstein, Acker, & Annis, 1967), 
as well as the juridically endorsed relation of 
mental illness to danger (Platt & Diamond, 1966; 
Sarbin, 1967a), also grows out of the label’s 
implications. 

It is one thing to demonstrate that “mental 
illness” has achieved mythic status and that its 
continued employment stands in the way of 
developing policies and practices for meeting 
some important social problems; it is another 
thing to recognize that some people, sometimes, 
somewhere, engage in conduct that violates pro- 
priety norms, including norms controlling in- 
group aggression. Ellis is justifiably concerned 
with the problem of what disposition to make of 
these norm violators. His solution to the prob- 
lem is to label them mentally ill (sans “moral 
overtones”). Such labeling provides a warrant 
for segregating norm violators in mental hos- 
pitals or referring them to psychotherapists. The 
warrant contains (sometimes explicitly) the no- 
tion “for the patient’s own good.” The history 
of the mental hospital system and of the mental 
health movement in America witnesses that “the 
patient’s good” is little more than a cliché ut- 
tered to offset the degradation and desocializa- 
tion outcomes. 

If my previous arguments are not footless, 
then Ellis’s recommendation that we continue 
the practice of labeling people mentally ill should 
be forcefully rejected. If his advice is rejected 
on the grounds of logic and of humanitarian 
values, then we are left with a gap in the social 
fabric. What should citizens and officials of an 
open society do about the problem of norm 
violation? What, if anything, should we do about 
people who are sometimes described as silly, 
unpredictably eccentric, perturbed, deviant, mute, 
shameless, rude, impertinent, immodest, dis- 
honest, childish, dangerous, hostile, aggressive, 
and so on? Current practice is, under some con- 
ditions, to regard the behavior described by such 
terms as symptoms of, or caused by, mental ill- 
ness. Ellis (1967) illustrates this point nicely. 
With impressive documentation, he says: 

In the last analysis, almost all neurosis and psy- 
chosis consists of some self-dishonesty. . . . When, 
therefore, one fully faces the fact that one is “men- 
tally ill,” that this is not a pleasant way to be, and 
that one is partially responsible for being so, one 
becomes at that very point, more honest with one- 
self and begins to get a little better [p. 440]. 
What function is served other than the imputa- 
tion of a discredited mental-state causality? More 
continuous with observation would be the sub- 
stitution of the word “dishonest” for “mentally 
ill.” 

In exposing mental illness as a myth that has 
outlived its usefulness, the label becomes im- 
proper and futile. Thus we are left with a far- 
reaching problem in jurisprudence, law enforce- 
ment, social engineering, and community psy- 
chology. The problem may be formulated as a 
question: What criteria should be employed to 
deprive a man of his liberty, his civil rights, his 
capacity for self-determinism, and so on? It 
would be foolhardy for me to try even to sug- 
gest answers in this brief paper. However, I can 
point to some partially charted areas that re- 
quire further exploration. 

All of us must put our heads together and 
decide how free and open a society we want. This 
decision is prerequisite for establishing criteria 
to identify those persons who should not be free. 
It is my belief that with increasing application of 
democratic principles the use of “mental illness” 
will be dropped as an intervening category be- 
tween overt conduct and juridically established 
status as free or restrained. The arguments of 
Szasz (1963); the observations of Goffman 
(1961); the historico-legal studies of Platt and 
Diamond (1966); the persisting dissatisfaction
with such legal precedents as McNaughten, Dur- 
ham, and others (Diamond, 1964; Dreher, 1967); 
the disillusionment of psychiatric and psychologi- 
cal practitioners with mentalistic and scholastic 
theories (Sarbin, 1964); and the development of 
community psychology (The Conference Com- 
mittee, 1966)—these and other forces are con- 
verging toward finding a fair and more efficient 
process for arresting, detaining, and incarcerating 
individuals whose public conduct violates current 
propriety norms. 

In this connection, we must confront the im- 
plications of a currently common practice of 
regarding deviant conduct (e.g., homosexuality) 
as equivalent to sickness. The refusal of an indi- 
vidual to accept the pejorative classification of 
mental illness and, correlatively, his refusal to 
enter psychotherapy, are taken as signs that he, 
according to Ellis, does not want “to improve his 
lot.” The careful work of Hooker (1957, 1958) 
suggests that in the culture of male homosexuals 
the distributions of conventionally used indi- 
cators of psychopathology (e.g., Rorschach varia- 
bles) are not substantially different from the dis- 
tributions of heterosexuals. Deviance from cul- 
tural norms is a societal problem. It is doubtful 
whether the use of the mental-illness label or any 
other epithet of degradation will contribute to 
the solution of the problem. I know of no evi- 
dence that supports the contrary notion that 
societal problems associated with cultural devi- 
ance are ameliorated by diagnosing deviant indi- 
viduals as “ill.” 

We turn our attention briefly to the problems 
in jurisprudence generated by the facts of norm 
violation and by the continued use of the mental- 
illness doctrine. A cursory review of legal 
treatises makes clear that the law, its writers, and 
interpreters, although deeply involved in the 
problems of equity and justice, have not been 
concerned with questioning the ontological status 
of mental illness. Such verdicts as “not guilty by 
reason of insanity” reflect the dualism upon 
which much of our jurisprudence rests, not to 
mention our theology and metaphysics. This 
verdict is not unlike many constructions to be 
found in legal treatises. The hidden metaphor is
this: It is as if there is a body and a mind normally
functioning in harmony. The body performs
actions under the governance of the immaterial,
invisible mind. Where the acts of the body and
the intent of the mind are not in harmony in
meeting normative standards of conduct, explanations
in terms of rule-following models are
inadequate. Under these conditions a causal explanation
is required: The mind is not properly
controlling the body. Therefore the body is
declared “not guilty” and the mind becomes the
object of punishment or retribution. The aim of 
such actions is to exorcise the evil influences or 
mental states that guided the body to perform 
improper or sinful acts. 

While I may be charged with unrestrained 
hyperbole, the historical facts are undeniable. The 
same cultural thought model that generated the 
medieval demoniacal model also produced the 
modern mental-illness model to explain conduct 
that does not meet rule-following prescriptions. 

The rejection of such an entrenched thought 
model by the relevant professionals is in the 
nature of a scientific revolution. As in all sci- 
entific revolutions, a new metaphor is needed to 
replace an exploded myth. The most likely can- 
didate for such replacement is a metaphor that 
denotes recent and current observations not convincingly
assimilated into the older labels. Elsewhere,
I have presented arguments in support of
a new metaphor—the transformation of social 
identity—a metaphor that captures the ante- 
cedent and concurrent process of becoming a 
norm violator (Sarbin, 1967a, 1967b, 1967c; 
Sarbin, Scheibe, & Kroger, 1965). Because of
space limitations, I can say only that the metaphor
arises from a comprehensive social theory,
a theory that rejects mentalistic metaphors as
being feebly inappropriate to the enormity of
the theoretical and societal problems that confront
us.

In these few pages I have tried to make the 
case that it is futile to try to support the propo- 
sition that some people be labeled “mentally ill.” 
The case stands or falls on the coherence of the 
ontological argument. My argument declares that 
the label is vacuous, save as an epithet of pejoration.
Further, its scientific utility is suspect
because of its reliance on an outworn mentalistic
concept: the ghost in the machine, to use Ryle’s
(1948) apt metaphor. 
DIFFERENCES BETWEEN NEGRO AND WHITE
PREGNANT WOMEN ON THE MMPI¹


ROBERT H. HARRISON² AND EDWARD H. KASS
Channing Laboratory, Harvard Medical School 


MMPI responses of 772 Negro and white lower class pregnant women were 
examined for race differences. 6 of 16 conventionally scored MMPI scales 
differentiated between the 2 groups at the .05 level. The absolute differences 
in means, while significant, were small. By contrast, 213 of the 550 items were 
significant at the .05 level. A factor analysis of the 150 most significant items 
produced 20 factors, each of which distinguished between the groups well 
beyond p < .001. Internal analysis of the conventionally scored scales revealed 
that approximately half of the race-significant items in each scale are scored 
in the Negro direction, the rest in the white direction. It is suggested that 
personality scales developed without regard to internal consistency criteria will 
often fail to detect differences between racial groups. 


Previous psychological investigations of 
personality differences between Negro and 
white Americans have revealed few replicable 
or striking differences between the two racial 
groups. The absence of striking differences in
personality is surprising in view of what is 
known about the contrasting social environ- 
ments of the two groups. Negroes must con- 
tend with personal and economic discrimina- 
tion. More often than whites they appear to be 
brought up in a matriarchal family structure 
(Pettigrew, 1964). Their religious beliefs and 
institutions are often different from those of 
the white majority (Yinger, 1957). If social 
variables such as these influence the develop- 
ment of personality, and if presently avail- 
able tests are sensitive to personality differ- 
ences, substantial racial differences in person- 
ality test scores should be found regularly, 
yet these have not been consistently observed. 

Four explanations for this paradox suggest
themselves: (a) uncontrolled variables affecting
the samples to be compared have
minimized mean differences and maximized
variability within each group; (b) differences
may not be marked in the areas of adjustment
and psychopathology, which have been
the focuses of most previous research; (c)
most of the studies have been done on the
middle class Negro and his white counterpart,
and there is reason to believe that middle
class Negroes minimize their differences from
whites (Pettigrew, 1964); and (d) the tests
used in the various studies may not be sufficiently
sensitive. Each of these explanations
deserves further scrutiny.

HD 01288 from the National Institutes of Health,
the initial planning; and to Sheila Hodge, Barbara
Eidam, and Patricia O'Shea, who were chiefly responsible
for collecting the data. Claire Finnegan
and Olga Ulchak provided valuable nursing care.
Thanks are also due to Beverly Lee and David Drew
of the Harvard Computing Center for their programming
assistance, and to Theodore Colton and
Kailie Uong for their advice and assistance in the
statistical analyses.

2 Now at Boston University.

Most studies reviewed by Klineberg 
(1944), Dreger and Miller (1960), and Mc- 
Donald and Gynther (1963) were controlled 
for age and sex. Social-class variables have 
been relatively well controlled. McDonald 
and Gynther’s (1963) parametric (Sex ×
Class × Race) study of high school seniors
in a segregated school system, with the 
MMPI as a test instrument, found that 
social-class differences were insignificant rela- 
tive to racial differences. In general, IQ and 
education have not been controlled, and their 
effects on most personality tests are unknown. 
However, it seems likely that their control 
would reduce rather than amplify racial differences
in test scores. A further problem in
subject selection is that most studies have
been done on samples of high school or college
students, on patients, or on prisoners.
Karon’s (1958) study using the Tompkins-Horn
Picture Arrangement Test is the only
study in the literature which used representative
sampling procedures. The consideration
of the problem of sampling restricts the gen- 
erality of the differences which have been 
found. 

The second alternative considered above is 
that racial differences may not be marked in 
the areas of adjustment and psychopathology. 
Replicable but moderate differences have 
been found on the MMPI, as summarized 
in Table 1. These studies indicate that
Negroes are more elevated than whites (in
at least three of nine studies) on L, F, Hs,
Sc, and Ma scales and less elevated (in three
studies) than whites on Hy. The Bernreuter
Personality Inventory has shown little promise
of being able to differentiate between the 
groups, and similarly limited differentiation 
has been achieved using the Bell Adjustment 
Inventory (Dreger & Miller, 1960). Outside 
of the adjustment area, only Mussen’s (1953) 
work on lower class boys has shown racial 
differences of the expected magnitude. He 
found that 14 of 50 TAT content analysis 
categories were related to racial differences. 
Karon (1958), using the Tompkins-Horn 
Picture Arrangement Test (PAT), found that 
only 13 of 150 PAT scales reliably differ- 
entiated southern Negroes from whites. Even 
more unimpressive and scarcely consistent 
findings have appeared in work with the 
Rosenzweig Picture Frustration Test (Dreger 
& Miller, 1960). Work with the Allport- 
Vernon-Lindzey Study of Values, using col- 
lege students, has produced a minimum of 
differences (Dreger & Miller, 1960). Racial 
differences in nonadjustment areas of person- 
ality are no more striking than in the adjust- 
ment area. 

Pettigrew (1964) suggested that middle 
class Negroes attempt to minimize differences 
between themselves and their white middle 
class counterparts. These pressures are not 
so great among lower class Negroes. Since 
all of the studies using the Bernreuter Per- 
sonality Inventory, the Bell Adjustment In- 
ventory, and the Allport-Vernon-Lindzey 
Study of Values have involved college stu- 
dents as subjects, it is hardly surprising that 
no racially specific differences have been 
found. The MMPI studies have used lower 
class and lower middle class samples (prisoners
and patients) and have produced moderate
differences. Mussen’s (1953) study with
lower class boys produced similar degrees of 
difference. Negative evidence for Pettigrew’s 
(1964) hypothesis, however, is McDonald and 
Gynther’s (1963) failure to find a significant 
Race × Class interaction on most of the
MMPI scales in their study. Education was 
held constant, however, and the observed 
class differences were among the parents 
and not among the subjects themselves. The 
hypothesis that middle class Negroes tend to 
minimize differences between themselves and 
white middle class counterparts offers at 
least a partial explanation of the lack of 
striking differences between Negroes and 
whites that has been found in psychological 
tests. 

Finally, it is possible that the tests are 
not sensitive enough. Either the items them- 
selves may be insensitive, or discriminating 
items may be combined in scales in such a 
way as to cancel each other’s effects, produc- 
ing nondiscriminating scales. Negroes and 
whites could vary considerably in the way 
they achieved similar test scores. This criti- 
cism is particularly applicable to tests such as 
the MMPI, whose scales were developed with 
no regard to internal consistency criteria. To 
find out whether the scales only or both the 
scales and the items are insensitive to race 
differences, one has to do an item analysis. 
Few item analyses are reported in the litera- 
ture. None of these item studies examined 
the possibility that the two groups obtain 
close to identical test scores in different 
ways. 

The present study was designed to test 
the hypothesis that there are racial differ- 
ences in personality and to meet objections 
that were raised in connection with previous 
work in the following ways:

1. Subjects were selected from successive
admissions of pregnant women to a city
hospital prenatal clinic. While the conclusions
will be limited to lower class pregnant
females, behavioral self-selection was close to a
minimum, in that the hospital and clinic serve
exclusively the lowest socioeconomic class.
Because all deliveries are on a nonpayment
basis, without private care, it can be assumed
that observable differences are largely
TABLE 2
MEANS, STANDARD DEVIATIONS, AND DIFFERENCES ON FORMAL MMPI SCALES

                     Negro             White
Variable            M       SD        M       SD     Difference      t
?                  8.91   12.22      6.17    9.37      +2.74
L                  4.95    2.56      4.85    2.28      +0.10       0.57
F                  5.95    4.03      5.35    3.61      +0.60       2.23*
K                 13.66    5.20     13.89    5.10      -0.23       0.62
Hs                10.70    5.26      9.81    5.26      +0.89       2.35*
D                 24.82    5.40     24.78    5.67      +0.04       0.10
Hy                23.08    6.18     23.72    6.00      -0.70       1.47
Pd                17.96    4.86     17.85    4.98      +0.11       0.28
Mf                34.05    4.16     34.08    4.10      -0.03       0.10
Pa                 9.53    3.85      9.32    3.39      +0.21       0.80
Pt                15.52    7.89     15.03    8.03      +0.49       0.87
Sc                15.18    8.88     13.66    8.94      +1.52       2.36*
Ma                16.69    4.42     15.60    4.69      +1.09       3.32**
Si                31.47    7.76     32.50    8.74      -1.03       1.74
Es                38.87    5.86     40.42    6.30      -1.55       3.59**
MA                18.62    7.96     19.02    8.69      -0.40       0.66
Discriminant
  function        -6.92    3.04     -8.97    2.90      +2.06

* p < .05.
** p < .001.


independent of socioeconomic class differences.

2. If Pettigrew’s hypothesis is correct, differences
should be at a maximum in a lower
class group such as this one.

3. The MMPI contains 194 items which
are not on any of the clinical scales. Many of
these are related in manifest content to
topics having no obvious relevance to adjustment.

4. Item analyses were conducted, and the
possibility that different racial groups have
different ways of obtaining approximately
equal scale scores was examined.


METHOD 
The data were collected in the course of a study 
on predictive relationships of psychological and 
physiological variables on the outcome of pregnancy,
that is, prematurity, complications of pregnancy,
and complications of delivery.


Patients 

The total sample on which data were collected 
consisted of 1,038 women in their 20-24th week of 
pregnancy. They were selected as successive admissions
to the Prenatal Clinic at the Boston City
Hospital between July 17, 1961 and December
30, 1963. Each patient registering at the clinic who
was 20 weeks pregnant or less was given an appointment
to return during the 20-24th week. At this
time, in addition to the usual obstetrical examina- 
tion, the patient was asked to supply a specimen 
of urine for bacteriological culture and was given 
an appointment to return for the MMPI. At the 
conclusion of the MMPI, the patient received a 
measured dose of radiolabeled iodoalbumin for 
determination of plasma volume, total blood volume, 
red blood cell mass, and hematocrit. The patients 
were informed that the blood volume procedures 
were to be conducted only after the psychological 
testing had been completed. 

Of the total of 1,038 women who were seen be- 
fore the 20th week of pregnancy, only two patients 
in the consecutive series refused to take the MMPI, 
one a frankly schizoid individual and the other a 
young unmarried mother whose parents refused to 
allow her to participate. Of the 1,036 remaining 
women, 264 were excluded from the final analysis 
of the data, leaving 772. The exclusions were on the 
basis of the following successively applied criteria: 
(a) some physiological data missing, usually because
of technical problems associated with withdrawal
and processing of blood (118); (b) termination
of pregnancy with spontaneous abortion, stillbirth,
elective cesarean delivery, or twin delivery
(71); (c) report of delivery at another hospital 
unobtainable (3); (d) illiteracy—the MMPI had to 
be read to these patients—(47); and (e) MMPI F 
scale over 90 (25). Of the 72 patients rejected on 
MMPI criteria, 57 were Negro and 15 were white. 
In the final sample, 389 patients were white and
383 patients were Negro.
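
The sample attrition described above can be re-added as a quick arithmetic check. The sketch below is only a bookkeeping aid: every count is copied from the text, while the variable names and the shortened category labels are ours.

```python
# Tally of the sample attrition reported in the Patients section.
seen = 1038        # women seen before the 20th week of pregnancy
refused = 2        # patients who did not take the MMPI

# Successively applied exclusion criteria (labels abbreviated by us).
exclusions = {
    "physiological data missing": 118,
    "abortion, stillbirth, elective cesarean, or twins": 71,
    "delivery report unobtainable": 3,
    "illiteracy": 47,
    "MMPI F scale over 90": 25,
}

tested = seen - refused
excluded = sum(exclusions.values())
final_n = tested - excluded
mmpi_rejects = exclusions["illiteracy"] + exclusions["MMPI F scale over 90"]

print(tested, excluded, final_n, mmpi_rejects)  # 1036 264 772 72
assert final_n == 389 + 383      # white plus Negro patients in the final sample
assert mmpi_rejects == 57 + 15   # Negro plus white patients rejected on MMPI criteria
```

The reported figures are internally consistent: the five criteria sum to the 264 exclusions, and the remaining 772 patients split into the two racial groups as stated.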
TABLE 3
MMPI ITEMS DIFFERENTIATING BETWEEN NEGROES AND WHITES

Significance level    T    F

p < .000001   98, 415, 469, 490, 213, 11, 58, 173, 53, 557, 25, 477, 73, | 419, 30
              538, 229, 400, 429, 378, 498, 546, 392, 284, 558, 456,
              457, 4

p < .00001    482, 280, 511, 206, 93, 513, 165 | 9, 468, 530, 532

p < .0001     420, 552, 164, 237, 476, 426, 78, 202, 364, 279 | 238, 446, 116

p < .001      449, 129, 79, 167, 125, 231, 132, 147, 136, 428, 15, 473, 486, 133, 64, 304, 74, 38, 496,
              84, 521, 241, 239, 502, 264, 386, 527, 59, 46, 556, 406, 376
              184, 67, 70, 525, 423

p < .01       354, 510, 432, 437, 89, 566, 44, 472, 407, 526, 298, 297, 540, 450, 531, 225, 282, 506,
              433, 106, 69, 425, 114, 199, 275, 545, 340, 416, 124, 464, 439, 563, 522, 246, 26,
              16, 24, 404, 485, 3, 380, 319, 436, 150, 385, 80 | 255, 322, 81,ᵃ 22, 285, 396,
              471, 524, 34

p < .05       349,ᵇ 480, 218,ᵇ 101, 344,ᵇ 27, 178, 157, 221, 393, 252, | 481, 160,ᵇ 474, 208, 561, 171,
              348, 523, 352, 166, 72, 283, 117, 66, 417, 212, 265, 261, 103, 223, 145, 140, 18, 542,
              453, 120, 163, 226, 35, 341, 395, 200, 7, 384, 222, 410, 361, 347, 421, 214, 193, 105,
              547, 87, 121, 320, 57, 161, 299, 293, 170, 483, 412, 92, 39
              324

Note. An item appears in the T column if it is answered “True” more frequently by Negroes than by whites and in the F
column if it is answered “False” more frequently by Negroes than by whites; items are listed in decreasing order of significance
within each significance level.

a These items were excluded from the factor analysis on the basis that they exceeded a 95-5 split between T and F.

b These items were included in the factor analysis.


Differences in socioeconomic status between the 
Negro and white groups were not measured. How- 
ever, since the entire sample came from the socio- 
economically underprivileged neighborhoods sur- 
rounding Boston City Hospital, these differences 
cannot have been great. Education was not formally 
measured. More Negroes than whites had to be 
excluded by the illiteracy and F scale criteria, how- 
ever. 


Procedure 


The card form of the MMPI was administered 
individually, with the administrator either present 
in the patient’s cubicle or close at hand. Use of the 
? category was allowed, but if over 15% of the 
patient’s responses fell in that category, she was 
asked to sort the category again. Her final sort was 
regarded as valid, regardless of whether the 15% 
criterion had been met. The 4 validity scales, the 
10 standard clinical scales, Barron’s Ego Strength 
(Es) scale, and the Taylor Manifest Anxiety (MA)
scale were scored and punched on IBM cards. The
K correction was not made. The patient’s response to
individual items was also recorded and punched 
on IBM cards as deviant, normal, or ?. 


RESULTS 
Differences on Scales, Items, and Factors 


Scale scores. Table 2 presents the means,
standard deviations, and t tests for the 16
scales. The Negro group was significantly
higher than the white group on ?, F, Hs, Sc,
and Ma and lower than the white group on
Es. These findings are in agreement with the
previous literature cited in Table 1.
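
As an illustration of how a t value in Table 2 relates to the tabled means and standard deviations, the following sketch recomputes one row from summary statistics. It assumes an independent-groups t test with pooled variance and the final group sizes of 383 and 389; the text does not state which form of the test was used, so this is a plausibility check and not a description of the original computation. The function name is ours.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t for two independent groups, computed from summary statistics."""
    # Pooled within-group variance, weighted by degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error

# Hs row of Table 2: Negro M = 10.70, SD = 5.26; white M = 9.81, SD = 5.26.
t_hs = pooled_t(10.70, 5.26, 383, 9.81, 5.26, 389)
print(round(t_hs, 2))  # 2.35
```

Under these assumptions the Hs row reproduces the tabled value of 2.35. Rows whose means were rounded more coarsely may differ in the second decimal.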

Item analysis. Chi-square was computed for 
each item against the race criterion, using 
Yates’ correction and observing conventional 
restrictions on minimum expected frequencies. 
For each computation the ? category was 
ignored. Of the 550 items, 213 discriminated 
between Negroes and whites at the .05 level 
of significance. They are listed by level of 
significance in Table 3. Thus, differences 
based on the racial grouping overshadowed 
those we had set out to study. 
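
The item-level test described above can be sketched as follows. This is a minimal illustration of a chi-square with Yates' correction for one item's 2 × 2 table (race by True/False, with ? responses dropped). The item counts are hypothetical, and the minimum expected frequency of 5 is an assumed convention, since the text does not name the restriction that was applied.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square (1 df) for the 2x2 table [[a, b], [c, d]] with Yates' correction."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a marginal total is zero; chi-square is undefined")
    # Shrink |ad - bc| by n/2, but never below zero.
    shrunk = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * shrunk ** 2 / (row1 * row2 * col1 * col2)

def min_expected(a, b, c, d):
    """Smallest expected cell frequency under independence."""
    n = a + b + c + d
    return min(r * k for r in (a + b, c + d) for k in (a + c, b + d)) / n

# Hypothetical item: rows are Negro / white, columns are True / False.
counts = (210, 165, 160, 225)
if min_expected(*counts) >= 5:              # assumed minimum expected frequency
    chi2 = yates_chi_square(*counts)
    print(round(chi2, 2), chi2 > 3.84)      # 3.84 is the .05 critical value for 1 df
```

Because ? responses are ignored, the total n differs from item to item, which is why the function takes the four cell counts and not a fixed sample size.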

Factor analysis. Grouping the items conceptually
on the basis of subjective judgment
has obvious pitfalls. For this reason, the 150
most significant items were factor analyzed
in order to get a more objective basis for conceptualizing.
The tetrachoric correlations between
the items were computed (? was classified
as a deviant response) and then were
analyzed by the principal-components method
TABLE 4
SUMMARY OF FACTOR ANALYSIS

Factor No.   Content

I       Estrangement (N).a 26 items describing hopelessness, unhappiness, guilt, and self-devaluing
        schizoid thinking.
II      Intellectual and cultural interests (N). 21 items expressing interest in reading, studying, thinking
        about, and discussing various topics.
III     Denial of major symptoms (W). Descriptions of unusual bodily states and experiences are denied in
        most of the 10 items.
IV      Cynicism (N). 26 items describing disillusionment with the human race in which the respondent
        prefers “honesty” to “illusion.”
V       Admission of minor faults (W). 9 items admitting activities (petty thievery as a youth, playing
        hookey) subject to mild social censure.
VI      Romantic interest (N). These 9 items describe enjoyment of movies, novels, and discussions with
        romantic, mildly sexual themes.
VII     Somatic tension (N). These 9 items describe pains in the stomach and head, as well as general
        tension.
VIII    Impulse-ridden fantasy (N). 12 items combining attraction to sexual and aggressive activities combined
        with reality-based inhibition. They have a voyeuristic, vicarious flavor.
IX      ? (N). A miscellany of 7 items which suggest rigidly moralistic attitudes and auditory hallucinations.
X       ? (N). 4 items suggesting ability to imagine pleasant events and deny the possibility of unpleasant
        events.
XI      Dislike of school (W). 6 items reporting bad deportment, playing hookey, disinterest in school.
XII     Religiousness (N). 19 items describing conservative religious beliefs and practices, condemning indulgence
        in sex, alcohol, and smoking.
XIII    Masochism versus sadism (W). 5 items denying sadistic impulses and expressing willingness to
        suffer for the common good.
XIV     Compulsive orderliness (N). 7 items describing an abhorrence of physical or intellectual sloppiness
        and a desire to set things straight.
XV      Fearfulness (N). Many of the common phobias (dark, lightning, spiders) are included in these
        8 items.
XVI     ? (W). 4 highly diverse items.
XVII    Dream concern (N). 2 items on frequent dreaming, and 2 suggesting embarrassment.
XVIII   Self-consciousness (W). 11 items describing self-conscious embarrassment in group situations.
XIX     ? (N). 7 diverse items suggesting honesty about oneself and one’s world.
XX      Group sociability (N). 6 items suggesting enjoyment of large groups and strangers.

Note. A list of the specific items for each factor may be obtained from the senior author; only items with factor loadings
of .30 or greater were considered to belong to a factor.

a N indicates that Negroes were higher on a factor than were whites; W indicates the reverse.


with 1.00 in the diagonals. Although 50 factors
with latent roots over 1.00 were extracted,
following Cooley and Lohnes’ (1962)
recommendations, only the first 20 factors
were preserved for subsequent rotation.3
These 20 factors accounted for 54% of the
total variance in the matrix and were rotated
by Kaiser’s Varimax criterion to orthogonal
simple structure. From the Varimax rotation,
16 of the 20 factors could be named without
undue conceptual strain. Table 4 presents a
summary of the item content for each factor.
Factor scores were then computed for each
subject by the direct method outlined by
Harman (1960).4

3 Limitations of the Varimax rotation program
available at the Harvard Computing Center restricted
our selection to 20 factors.
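
For readers unfamiliar with the rotation step, the following sketch shows one way a Varimax rotation can be carried out: successive planar rotations of pairs of factors, each chosen to maximize the variance of the squared loadings. It is an illustration under stated assumptions, not the program used at the Harvard Computing Center. It implements the raw criterion without row normalization, and the loading matrix is hypothetical.

```python
import math

def varimax_criterion(loadings):
    """Raw varimax criterion: summed variance of squared loadings within each factor."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squares = [row[j] ** 2 for row in loadings]
        total += sum(s * s for s in squares) / p - (sum(squares) / p) ** 2
    return total

def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Rotate loadings (rows = items, columns = factors) by pairwise planar rotations."""
    rotated = [list(row) for row in loadings]
    p, k = len(rotated), len(rotated[0])
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                u = [row[i] ** 2 - row[j] ** 2 for row in rotated]
                v = [2 * row[i] * row[j] for row in rotated]
                a, b = sum(u), sum(v)
                c = sum(x * x - y * y for x, y in zip(u, v))
                d = 2 * sum(x * y for x, y in zip(u, v))
                # Angle that maximizes the criterion for this pair of factors.
                phi = math.atan2(d - 2 * a * b / p, c - (a * a - b * b) / p) / 4
                largest_angle = max(largest_angle, abs(phi))
                cos_phi, sin_phi = math.cos(phi), math.sin(phi)
                for row in rotated:
                    x, y = row[i], row[j]
                    row[i] = x * cos_phi + y * sin_phi
                    row[j] = -x * sin_phi + y * cos_phi
        if largest_angle < tol:   # no pair needed further rotation
            break
    return rotated

# Hypothetical unrotated loadings for six items on two components.
unrotated = [[0.7, 0.5], [0.6, 0.6], [0.8, 0.4],
             [0.5, -0.6], [0.6, -0.5], [0.4, -0.7]]
rotated = varimax(unrotated)
print(varimax_criterion(unrotated) <= varimax_criterion(rotated))  # True
```

Because each step is an orthogonal rotation, the communality of every item (the row sum of squared loadings) is unchanged; only the distribution of the loadings across factors moves toward simple structure.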

Factor score differences. Table 5 presents 
the means, standard deviations, and t tests
for the two racial groups on the factors. In 
decreasing order of importance for racial 
differences, Negroes reported themselves as 
more religious, intellectual, romantic, cynical, 
impulsive in fantasy, fearful, estranged, 
sociable, concerned with dreams, orderly, and 
somatically tense than whites and less 
masochistic, free of aberrant behavior, in- 
dulgent in minor crimes, self-conscious, and 
antagonistic toward school than whites. 


4 The model in which communalities are fixed at
1.00 was used for computation of factor scores.
TABLE 5
MEANS, STANDARD DEVIATIONS, AND DIFFERENCES IN FACTOR SCORES

                                               Negro            White
Factor                                        M      SD        M      SD    Difference    t**
I     Estrangement                          +2.49   9.17     -2.69   8.07     +5.18      8.33
II    Intellectual and cultural interests   +2.92   6.48     -3.04   5.37     +5.96     13.91
III   Denial of major symptoms              -1.82   4.76     +1.75   3.81     -3.57     11.48
IV    Cynicism                              +3.24   9.32     -3.52   7.83     +6.76     10.91
V     Admission of minor faults             -1.03   3.67     +1.01   3.67     -2.04      7.78
VI    Romantic interest                     +1.38   3.42     -1.47   2.74     +2.85     12.81
VII   Somatic tension                       +0.61   3.99     -0.71   3.48     +1.32      4.93
VIII  Impulse-ridden fantasy                +1.58   5.14     -1.67   3.65     +3.25     10.13
IX    (?)                                   +0.75   3.07     -0.83   2.49     +1.58      7.81
X     (?)                                   +0.43   2.42     -0.53   2.05     +0.96      5.89
XI    Dislike of school                     -0.60   2.64     +0.46   2.94     -1.06      5.26
XII   Religiousness                         +3.52   6.62     -3.62   4.65     +7.14     17.35
XIII  Masochism versus sadism               -0.85   2.10     +0.91   1.88     -1.76     12.31
XIV   Compulsive orderliness                +0.46   2.38     -0.52   2.17     +0.98      5.96
XV    Fearfulness                           +0.98   3.01     -0.98   2.91     +1.96      9.22
XVI   (?)                                   -0.82   2.56     +0.70   2.06     -1.52      9.08
XVII  Dream concern                         +0.64   3.19     -0.70   3.08     +1.34      5.97
XVIII Self-consciousness                    -0.94   4.53     +0.86   4.56     -1.80      5.50
XIX   (?)                                   +0.86   2.94     -0.96   2.54     +1.82      9.23
XX    Group sociability                     +0.80   3.46     -0.78   3.38     +1.58      6.40
Discriminant function                       +2.64   2.85     -2.59   3.11     +5.22

** p < .001 for all t.


The pattern of correlations between factor 
scores and the nature of the trait labels 
suggested that the Negroes are more anxious 
in their thoughts (Factor Nos. I, III, VII, 
XIV, XV) and less anxious socially than 
whites (Nos. XI, XVIII, XX). Negroes pre- 
sented themselves as being less inclined to act 
out destructive impulses than whites (Nos. V, 
IX, XIX), while acknowledging the presence 
of these impulses on a fantasy level (Nos. 
IV, VIII, XIII). They also appeared more 
introverted and romantic (Nos. II, VI, VIII, 
IX) and more religious (No. XII).


Group Separation 


Multiple discriminant analysis was used
to further clarify two issues: (a) What
variables contribute the most (indirectly as
well as directly) to a discriminant function
which maximizes the separation between the
two groups? (b) What is the maximum
degree of separation possible using scale
scores and using factor scores?

Table 6 summarizes the results of discriminant
analysis using the MMPI scale
scores as independent variables. The univariate
analysis stresses Es, ?, Ma, Sc, Hs, F,
Si, and Hy in decreasing order of importance
for race differences. In contrast, the multivariate
analysis stresses Hy, Hs, MA, Es, Si,
D, K, and ? in decreasing order of importance
for this distinction and suggests that the
profile of scores on these scales is more useful
in separating the two groups than scale scores
taken one at a time.

Table 7 summarizes the corresponding
analysis for factor scores. Since the factor
scores are correlated (whereas the factors are
not), there is less than a perfect correspondence
between the results of the univariate
TABLE 6
DISCRIMINANT ANALYSIS OF SCALE SCORES

MMPI      Normalized    Scaled       Relative
scale     vector a      vector b     importance
?          +.0802        24.24          8
L          +.0534         3.60         15
F          +.0410         4.34         14
K          +.2044        29.21c         7
Hs         +.5789        84.54          2
D          +.2029        31.16          6
Hy         -.5538       -93.56          1
Pd         -.0096       -10.13         16
Mf         +.1803        20.69c         9
Pa         +.0753         7.59         13
Pt         +.0407         8.99         12
Sc         +.0717        17.33         11
Ma         +.1562        19.58         10
Si         -.1497       -34.36          5
Es         -.2419       -40.90          4
MA         -.3374       -78.03          3

a These are weights for the discriminant equation; the F ratio
for the discriminant function is 5.66 (df = 16/755, p < .001).

b These are the normalized vectors multiplied by the square
root of the corresponding pooled within-group variances.

c These variables are suppressors; their weighting in the
discriminant equation is opposite to that indicated by the
difference between means.


tests and the weighting of the variables in the
multivariate discriminant function. While the
univariate analysis stresses religiousness,
intellectuality, romanticism, cynicism, and an

TABLE 7
DISCRIMINANT ANALYSIS OF FACTOR SCORES

                                    Normalized    Scaled       Relative
Factor                              vector a      vector b     importance
I     Estrangement                   +.2506       +60.14
II    Intellectual interests         -.0853       -14.07c         4
III   Denial of major symptoms       -.1701       -20.41          7
IV    Cynicism                       +.0463       +11.07         16
V     Admission of minor faults
VI    Romantic interest              +.2855       +24.55          2
VII   Somatic tension                -.1581       -16.44c         1
VIII  Impulsive fantasy              -.3512       -43.55c
IX    ?                                           -12.97c        15
X     ?
XI    Dislike of school              -.3751
XII   Religiousness                   .1580
XIII  Masochism and sadism
XIV   Compulsive orderliness         -.0793        -4.99c        17
XV    Fearfulness                    +.2446       +20.06          3
XVI   ?                              -.3581       -22.92
XVII  Dream concern                  -.0395        -3.44c        18
XVIII Self-consciousness             -.2474       -31.17
XIX   ?                              +.3340       +25.28
XX    Group sociability              +.0340        +3.23         19

a These are weights for the discriminant equation; the F
ratio for discriminant function is 28.29 (df = 20/751, p < .001).

b These are the normalized vectors multiplied by the square
root of the corresponding pooled within-group variances.

c These variables are suppressors; their weighting in the
discriminant equation is opposite to that indicated by the
difference between means.
TABLE 8
CLASSIFICATION OF GROUPS FROM MMPI SCALE SCORES

                          Actual race group
Predicted race group      Negro     White     Total
Negro                      233       132       365
White                      150       257       407
Total                      383       389       772

Note. χ² = 54.96.


impulse-ridden fantasy life, in the multivariate 
analysis the most important factors are 
estrangement, impulse-ridden fantasy, self- 
consciousness, dislike of school, religiousness, 
and romantic interests. 

Using the discriminant functions outlined
in Tables 6 and 7, it is possible to use each
function to classify subjects as Negro or
white on the basis of their test scores. The
formulae and computer program given by
Cooley and Lohnes (1962) take into account
the standard deviation of the discriminant
function scores of each group as well as the
size (antecedent probability of membership)
and compute a probability value of membership
in each group for each subject. Table 8
shows that using the discriminant function
based on MMPI scale scores, 61% of the
Negro and 66% of the white subjects can be
correctly classified; the overall hit rate is
63%. Using the factor scores, Table 9 shows
that 80% of the Negro group and 86% of
the white group can be correctly classified.
The overall hit rate (83%) is considerably
higher than that from scale scores.
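
The hit rates and the chi-square reported for Table 8 can be recovered from the four cell counts. The following sketch is illustrative only; the function names are hypothetical, and the use of Yates' continuity correction is an assumption (it is the variant that reproduces the printed value of 54.96; the article does not say which formula was used).

```python
# Table 8 counts: rows are the predicted group, columns the actual group.
table8 = [[233, 132],   # predicted Negro: actual Negro, actual white
          [150, 257]]   # predicted white: actual Negro, actual white


def hit_rates(t):
    """Return (first-group, second-group, overall) proportions correct."""
    actual_first = t[0][0] + t[1][0]
    actual_second = t[0][1] + t[1][1]
    total = actual_first + actual_second
    return (t[0][0] / actual_first,
            t[1][1] / actual_second,
            (t[0][0] + t[1][1]) / total)


def chi_square_2x2(t, yates=True):
    """Chi-square for a 2 x 2 table, optionally with Yates' correction."""
    a, b = t[0]
    c, d = t[1]
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0)  # continuity correction (assumed)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


negro_rate, white_rate, overall_rate = hit_rates(table8)
chi2 = chi_square_2x2(table8)
print(round(negro_rate, 2), round(white_rate, 2), round(overall_rate, 2))
print(round(chi2, 2))
```

The three proportions round to the 61%, 66%, and 63% quoted above.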


TABLE 9
CLASSIFICATION OF GROUPS FROM FACTOR SCORES

Actual race group

Note. χ² = 327.31.
Discussion


The main result of this study is that in 
our sample of pregnant women, race differ- 
ences in both MMPI items and derived fac- 
tor scales are of a magnitude seldom found 
in personality research. Whether or not the 
self-reports in the context of a white ad- 
ministrator in a white-dominated city hos- 
pital correspond with the self-report and 
“actual” behavior of these patients in other 
contexts is a matter for future investigation. 

Why are these results of such different 
magnitude than others in the personality test 
literature? It is probable that results of this 
order could have been hidden in previous data 
by the exclusive use of scale scores rather 
than that of item responses. The scale-score 
differences reported here are smaller than 
those reported in previous studies and are sig- 
nificant only because of the large size of the 
sample. Nonetheless, race differences on the 
items were huge. The scales are not very 
sensitive to race differences, whereas the items 
are remarkably sensitive. A canceling-out 
process must be at work in each scale. If so, 
each scale should have approximately equal 
numbers of (significant) Negro-favored and 
white-favored items. This is indeed the case, 
as is seen in Table 10. Table 10 also reveals 
that 32% of the discriminating items appear 
on none of the clinical scales, suggesting sub- 
stantial race differences in the nonadjustment 
areas of personality as well as in the adjustment
area. It is also possible that the whites
and Negroes have different ways of getting 
high scores on a scale. To test this hypo- 
thesis, items in each scale were ranked (for 
Negroes and whites separately) from those 
most frequently endorsed in the pathological 
direction to those least frequently endorsed. 
Spearman rank-order correlations were com- 
puted between Negro and white rankings of 
items belonging to each scale. These are re- 
ported in the right-hand column of Table 10. 
The agreement between the groups, while not 
complete, is generally high, suggesting that 
the two groups have similar ways of ob- 
taining high scores on the various clinical 
scales. 
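
The rank-order comparison described above can be sketched as follows. The endorsement percentages are hypothetical (the article reports only the resulting coefficients in Table 10), and the helper names are illustrative; the coefficient is computed as the Pearson correlation of the two sets of ranks, with tied values given their average rank.

```python
def ranks(values):
    """Rank values from largest (rank 1) downward, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # average of the tied positions
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation between two equally long lists."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical percentages endorsing six items of one scale in the
# pathological direction, listed item by item for each group.
group_a = [62, 48, 35, 30, 12, 5]
group_b = [55, 50, 20, 33, 10, 8]
rho = spearman(group_a, group_b)
print(round(rho, 3))
```

A value near 1 means that the two groups order the items of a scale in much the same way, which is the sense in which the text speaks of similar ways of obtaining high scores.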

In any case, the above analyses suggest
that scales based solely on empirical item 
validity (ignoring internal consistency cri- 
TABLE 10
AGREEMENT BETWEEN GROUPS ON MMPI SCALES

Scale                         No. significant items   No. significant items   Spearman r a
                              favored by Negroes      favored by whites
L                                      5                       4                  .95
F                                     14                       4                  .83
K                                      4                       5                  .90
Hs                                     7                       3                  .91
D                                     10                       9                  .93
Hy                                     7                      13                  .93
Pd                                     9                       5                  .94
Mf                                    17                      14              [illegible]
Pa                                    15                       6                  .95
Pt                                     7                       7                  .93
Sc                                    19                       5                  .92
Ma                                    12                       5                  .86
Si                                    11                      15                  .91
None of the above scales b            43                      26

a Spearman r is between rank of endorsement by Negroes and
rank of endorsement by whites.

b The items in this row are those race-significant items which
do not appear in the scales scored in this study. They are
separated according to those which were answered in the deviant
direction by Negroes (1st column) and those which were
answered in the deviant direction by whites (2nd column).


teria completely) are not likely to discrimi- 
nate well between groups defined by other 
than the validation criteria. For example, the 
D scale may be excellent for identifying de- 
pressives, but irrelevant for distinguishing 
whites from Negroes. Component factors of 
the D scale may, on the other hand, distin- 
guish well between racial groups and still be 
useful in identifying depressives.

Alternate explanations to those advanced
here must be considered. It may be that our 
subject selection was biased by socioeconomic 
and educational differences between the 
groups, producing many more differences 
than would have been found had such variables
been controlled. The fact that no race
differences were found on the K scale, which
has been suggested by Dahlstrom and Welsh
(1960) to be an MMPI correlate of socioeconomic
status, would indicate that any differences
in socioeconomic status between the
Negroes and whites were not large in this
study. McDonald and Gynther (1963) found
that when education was held constant,
differences had no impact on MMPI scale
scores. The situation, however, may be different
in the case of item analysis. Socioeconomic
status may be a variable requiring
more rigorous investigation. Educational dif- 
ferences may also be confounded with the 
race differences discovered. Many more Ne- 
groes than whites in our sample were elimi- 
nated on the basis of functional illiteracy. A 
residue of educational differences undetected 
by our rough criteria could be present in the 
sample whose data were analyzed. 

Another explanation for the magnitude of 
the differences found is the possibility that 
the state of being pregnant amplified race 
differences and minimized similarities. There
may be a Negro way of reacting to pregnancy
and a white way of reacting. This would require
that the differences in personality subside
after the termination of pregnancy. The
fact that the formal scale results replicate 
previous results very closely would cast doubt 
on this explanation for the scale differences, 
although it cannot be ruled out as an ex- 
planation for the item differences. 

The discovery of large differences between 
racial groups has emotional implications for 
many persons. Typically, studies of race dif- 
ferences have been conducted for many 
reasons, such as: (a) to prove or disprove 
the superiority of one group and/or the 
“place in society” of another (e.g., studies of 
intelligence and of admission rates to mental 
institutions); (b) to prove that one group 
has been suffering psychologically at the 
hands of another (e.g., Kardiner & Ovesey, 
1951); and (c) to describe differences in order 
to identify possible areas of misunderstanding 
and/or fruitful interchange between the 
groups. The discovery of large group dif- 
ferences stimulates a number of fears: (a) 
that one group is inferior to the other, (b) 
that the differences found will prick our consciences,
or (c) that these differences may be
irreconcilable. For many, therefore, a strong
preference for ignoring or minimizing differences
exists. The results of the present study
indicate that a high degree of differentiation
of Negro from white is possible on the basis
of pencil-and-paper test responses alone. The
recognition of the nature and extent of these
differences is perhaps a first step in their
understanding.
RELATION OF MOOD AND HYPNOTIZABILITY:
AN ILLUSTRATION OF THE IMPORTANCE OF THE STATE
VERSUS TRAIT DISTINCTION ¹


MARVIN ZUCKERMAN, HAROLD PERSKY,² AND KATHRYN LINK
Albert Einstein Medical Center, Philadelphia, Pennsylvania


The study was designed to test the hypothesis that affect states just prior to 
hypnosis induction are related to subsequent hypnotizability, while affect traits 
are not so related. The Multiple Affect Adjective Checklist (MAACL), an 
affect-state test, was given to Ss just prior to hypnosis. MMPI affect-trait 
measures were given after hypnosis. Hostility state (MAACL) was significantly 
and negatively correlated with hypnotizability in 3 runs of Ss tested in small, 
highly motivated groups. Depression state was negatively correlated with 
hypnotizability in 2 of the runs, and anxiety state was negatively correlated 
in 1 of the runs. Affect-state measures were unrelated to hypnotizability in 
a large, less motivated group. Affect-trait measures, as well as other trait 
measures from the MMPI, were unrelated to hypnotizability. The results show 
the importance of the state versus trait distinction in the prediction of 


hypnotic behavior and have implications for other areas of prediction. 


Hypnotizability is a reliably measured trait 
showing marked individual differences. Begin- 
ning with Charcot’s (1882) hypothesis of a 
relationship between hypnotizability and the 
hysterical personality type, there have been 
repeated attempts to define personality 
traits associated with hypnotizability. Barber 
(1964) and Hilgard (1965) have thoroughly 
reviewed this area of investigation. The his- 
tory of personality and hypnotizability rela- 
tionships is one of initially promising results 
which disappear during attempts at replica- 
tion. There are two explanations for this case 
of the vanishing correlation coefficients. The 
first is that the initial relationships found 
were due to chance. Since some investigators 
used personality inventories containing many 
scales, the low significant coefficients, or t
ratios, may have represented chance effects. 
A second explanation is that the relationships 
were found only in certain groups, because 
some particular sets of conditions existed 
during the hypnotic testing of these groups 
but not during the replication. 

Hilgard (1965) presented five replicated 
results from studies using paper-and-pencil 
techniques: 

1This work was supported by United States 
1. Subjects’ self-predictions of hypnotizability
were related to their subsequent hypnotic
performances.

2. Females’ attitudes toward hypnosis were 
related to their subsequent performances. 
This did not hold for males. 

3. Ideational interests were positively re- 
lated and motoric interests were negatively 
related to hypnotizability. 

4. Experience inventories containing ques- 
tions about hypnotic-like experiences in daily 
life were positively related to hypnotizability. 

5. A measure of acquiescence response set, 
consisting of the number of true responses to 
a 235-item version of the MMPI, was posi- 
tively related to hypnotizability. None of 
the standard content scales of the MMPI was 
consistently found to be related to hypnotic 
performance.

Barber’s (1964) review was more pessimistic 
and pointed in a different direction: 

The overwhelmingly null and contradictory findings
in this area strongly suggest that we should look
for factors other than “personality traits” which
might account for interindividual variability and
intraindividual consistency in response to suggestions.
The data . . . suggest that we might profitably
focus on the following variables which are less
enduring and more situationally-variable than the
characteristics that are presumably measured by
personality inventories. (1) S’s attitude toward and
relationship with E (2) S’s attitudes or goals in the
test situation and his motivation to perform well
or poorly on the assigned experimental tasks [p. 313].


Many of the personality scales which have 
been used in these studies represent attempts 
to measure affect traits such as anxiety, de- 
pression, and hostility. If Barber is correct 
in his appraisal of this field, the measurement 
of affect traits is inappropriate, and the 
measurement of affect states is more pertinent 
to subsequent hypnotizability. The distinction 
between trait and state, formulated by Cattell 
and Scheier (1961), Zuckerman (1960), and 
others, is relevant to this problem of pre- 
dicting hypnotizability. An affect trait is a 
relatively reliable level of mood over an extended
or indefinite period of time. An affect
state represents an affect level at some limited 
period of time ranging from the immediate 
moment to a day. If the subject’s attitudes 
in the hypnotic test situation are more crucial 
to hypnotizability than his general attitudes, 
we would hypothesize that the subject’s affect 
state, just prior to hypnosis, is more crucial
for hypnosis than his general mood level. 

One cannot measure affect states with the 
same instruments used to measure affect 
traits. First, most questionnaires ask a sub- 
ject how he “generally” or “usually” feels 
rather than how he feels at the moment or on
a particular day. In addition, such question- 
naires often show systematic changes on re- 
peated testing. Many are lengthy and require 
so much time to administer that a subject’s 
state could change during the testing
procedure.

Zuckerman and Lubin (1965) have de- 
veloped a Multiple Affect Adjective Check 
List (MAACL) to measure affect states. It 
consists of 132 adjectives with affective connotations.
Three scales have been developed
using an empirical item-selection technique: 
anxiety, depression, and hostility. The items 
selected for each scale were based on the 
responses of patients rated as high on the
particular affect, on responses of normals
who were put into the particular affect state
using hypnotic techniques, or on a combination
of these two criteria.

The test has proved extremely sensitive to 
stress effects of school examinations (Hayes, 
1966; Lieberman, 1966; Zuckerman, 1960; 
Zuckerman, Lubin, Vogel, & Valerius, 1964), 
sensory deprivation (Zuckerman, Albright,
Marks, & Miller, 1962; Zuckerman, Persky, 
Hopkins, Murtaugh, Basu, & Shilling, 1966), 
army basic combat training (Datel, Engle, & 
Barba, 1966; Datel, Gieseking, Engel, & 
Dougher), actual combat (Bourne, Coli, & 
Datel, 1966), negative personal evaluations 
(Weaver, 1965), a forced approach to a 
phobic object (Geer, 1965), fear-provoking 
movies (Bringmann, 1966; Folkins, Lawson, 
Opton, & Lazarus, 1968; Zuckerman et al., 
1964), hypnotic suggestion of anxiety (Levitt, 
Persky, & Brady, 1964), and overindulgence
in alcohol (Williams, 1966). Changes in a 
positive direction have been produced by brief 
psychotherapy (Goldstein, 1965), tranquiliz- 
ing drugs (Hankoff, Rudorfer, & Paley, 
1962), desensitization procedures and cogni- 
tive reappraisal (Folkins et al., 1968), and 
small quantities of alcohol (Williams, 1966). 
Significant correlations have been found be- 
tween the anxiety and depression scales and 
clinical ratings of these affects (Zuckerman, 
Lubin, & Robins, 1965; Zuckerman, Persky, 
Link, & Hopkins, 1967) as well as other 
test measures of these affects (Zuckerman & 
Lubin, 1965). 

The anxiety scale of the MAACL has been 
used in the series of experiments by Levitt 
et al. (1964) on the hypnotic induction of 
anxiety. It has proved to be the most sensi- 
tive measure used to assess this hypnotically 
produced state. One interesting sidelight to
these experiments was the finding of lower 
than normal anxiety levels in subjects in the 
hypnotic state before anxiety suggestions 
were introduced. Wolpe (1958) has hy- 
pothesized that anxiety and muscle relaxation 
are reciprocally inhibiting. In Wolpe’s be- 
havior therapy technique patients are re- 
laxed by suggestion prior to the presentation 
of mild anxiety stimuli. It would follow that 
intense anxiety is inhibiting to subsequent at- 
tempts at relaxation. Since relaxation tech- 
nique is the core of most hypnotic methods, 
a high anxiety state prior to such induction 
would be expected to interfere with hypnosis 
and result in a lower level of hypnotizability. 
This is probably assumed by most good hyp- 
notists who attempt to put the subject at 
ease and reassure him about hypnosis be- 
fore starting induction. 
Barber (1964) has suggested that the subject’s
attitude and relationship to the hypnotist
is a crucial variable. Presumably, a
friendly and trusting attitude is more favor- 
able to hypnosis than a hostile, suspicious 
one. Secter (1960) conducted an experiment 
in which subjects in one group were told 
peremptorily that they would be hypnotized 
(authoritarian approach), while subjects in 
the other group were told in a permissive 
manner that they could learn to hypnotize 
themselves and were given the option of not 
participating (all did participate). Only 17% 
of the group treated in an authoritarian 
manner achieved “deep hypnosis,” while 54% 
of the subjects treated permissively were able 
to reach this state. We might postulate that 
the authoritarian approach elicited hostility 
in many subjects and that this state of hos- 
tility was antagonistic to deep hypnosis. If 
hostility does inhibit hypnosis, then individ- 
uals who are hostile or suspicious, for what- 
ever reason, prior to the induction would not 
be expected to go very deeply into the hyp- 
notic state. 

The hypothesis of this study is that the 
affect states of anxiety, depression, and 
hostility, assessed just prior to hypnotic in- 
duction, are negatively related to subsequent 
hypnotic susceptibility; general traits of anx- 
iety, depression, and hostility are unrelated to 
hypnotic susceptibility.


METHOD 


The data for part of this study were obtained 
from screening procedures used to select Ss for an 
experiment involving the hypnotic induction of 
emotional states. Advertisements for male Ss, 21
years of age or over, were posted at the college employment
bureaus in the Philadelphia area. The
advertisement stated that the experiment involved 
hypnosis, but it did not go into further detail re- 
garding procedures. 

Small groups of respondees were brought into a 
meeting room for initial screening. The groups 
ranged in size from 3 to 12. The Ss were first in- 
formed of the general experimental procedures 
which would be used if they were acceptable. At
that point they were given the option of leaving
if they did not want to participate in the experiment.
None of the Ss quit at that point, possibly
because of the high monetary reward: $100 for 4
6-hour days after the screening session.

The Ss were then given the MAACL and were 
instructed to check all the words which described 
their feelings “now-today.” 
After the MAACL, the Ss were given Shor and
Orne’s (1962) Harvard Group Scale of Hypnotic
Susceptibility (HGSHS). This scale is administered
using a tape recording³ which contains preliminary
explanations, reassurances, the induction monologue,
and the suggestion tests. After the suggestion tests,
a posthypnotic suggestion and an amnesia suggestion
are given. In order to check the amnesia,
S is “awakened” and asked to write down everything
that happened. He then fills out a questionnaire
reporting his behavioral reactions and subjective
experiences during the various suggestions.
This self-report score correlates highly with scoring 
by observer raters on behavioral items (Shor & 
Orne, 1962). 

After the HGSHS, the Ss were given the MMPI.
The MMPI was scored on the Taylor (1953) Manifest
Anxiety (MA) scale, the Hathaway and McKinley 
(1951) Depression scale, the Schultz (1955) Hostility 
Control scale, and the Siegel (1956) Judged Mani- 
fest Hostility scale. The MA was developed using 
rational item-selection methods, while the Hatha- 
way and McKinley scale was developed using an 
empirical item-selection method. Both of these tech- 
niques have shown moderately high correlations 
with clinical rating criteria of the relevant affects. 
The validities of MMPI hostility scales have never 
been adequately demonstrated, and none of them 
seem to be highly related to clinically defined hostil- 
ity (Shipman, 1965). For this reason we selected two, 
rather than one, of these scales. The Siegel scale 
was developed using a rational item-selection technique,
and the Schultz scale was derived using an
empirical item-selection technique. The MMPI was
also scored for the 3 validity scales, the 10 standard 
clinical scales, and 11 additional scales. This scoring 
was done using an electronic computer scoring 
service. 

The three runs of Ss tested in this fashion were 
as follows: Run 1, n = 41; Run 2, n = 38; and
Run 3, n = 33. Runs 2 and 3 were used as replications
for the data obtained on Run 1.

At the time Run 1 was completed, but before the 
collection of data in Run 2, an attempt was made to 
replicate the findings in a large single group of 
49 male and female students from a psychology 
class at the University of Pennsylvania. This group 
was told in advance that the testing would take 
place during one of their class sessions. They were 
given the option of not participating. A few students 
came just to observe or to see what it was like.
The possibility of being selected for later research
was mentioned, but no details as to payments or
procedures were mentioned.


RESULTS 
Table 1 shows the correlations between
HGSHS scores and state (MAACL) and
trait (MMPI) measures of anxiety, depression,
and hostility in the three sets of subjects


3 Recorded by L. Dumas, technical supervision by
D. N. O’Connell.
TABLE 1
CORRELATIONS OF MAACL STATE AND MMPI TRAIT AFFECTS WITH HARVARD GROUP SCALE
OF HYPNOTIC SUSCEPTIBILITY

                            MAACL scale                                   MMPI scale
Condition       N    Anxiety   Depression    Hostility     Anxiety a    Depression b   Hostility c   Hostility d   Mean HGSHS
Small groups
  Run 1        41    -.36*     [illegible]   -.48**          .04           .06            .09           .02           7.05
  Run 2        38    -.23      [illegible]   [illegible]   [illegible]   [illegible]     -.04           .01           6.50
  Run 3        33    -.27      -.48**        -.36*         [illegible]    -.32           -.08        [illegible]   [illegible]
Large group    49    -.05       .00           .13                                                                     5.63

a Taylor Manifest Anxiety scale.
b Hathaway and McKinley Depression scale.
c Schultz Hostility Control scale.
d Siegel Judged Manifest Hostility scale.

* p < .05.
** p < .01.


tested in small groups of 3 to 12, and the 
large group tested in the class at the Uni- 
versity of Pennsylvania. The mean scores of 
the four groups on the HGSHS are also 
presented. 

The subjects tested in small groups gen- 
erally exceeded the subjects tested in the 
large group on hypnotic susceptibility. The 
mean HGSHS score of all subjects run in the 
small groups was 6.84, and the mean score of 
the subjects run in the large group was 5.63. 
The difference between these means yielded 
a t of 2.42, significant below the .02 level. 
Mean differences between the runs in the 
small groups were small and insignificant. 

The anxiety-state score correlated negatively
and significantly with HGSHS scores in Run
1 only. The depression-state score correlated
negatively and significantly with HGSHS
scores in Runs 1 and 3, but not in Run 2.
The hostility-state score correlated negatively
and significantly with HGSHS scores in all
three runs. None of the affect-state scores
correlated significantly with the HGSHS
scores in the large group. The lack of correlation
in this group cannot be attributed to a
restricted range of scores on the HGSHS.
Scores in this group were found along the
entire range from 0 to 12, and the variance
was slightly larger than in two of the three
small-group runs. None of the affect-trait
scores correlated significantly with the
HGSHS in any of the runs.
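
Whether a correlation of a given size is significant depends on the size of the run. A minimal sketch of the usual test is given below; it assumes the standard t transformation of a Pearson r with n - 2 degrees of freedom, which the article does not explicitly name, and the function name is illustrative. The two examples use coefficients that are legible in Table 1.

```python
import math


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing that a Pearson r differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Run 1 hostility state versus HGSHS: r = -.48 with n = 41.
t_run1_hostility = t_for_r(-0.48, 41)
# Run 3 hostility state versus HGSHS: r = -.36 with n = 33.
t_run3_hostility = t_for_r(-0.36, 33)
print(round(t_run1_hostility, 2), round(t_run3_hostility, 2))
```

In standard two-tailed t tables the .01 critical value at 39 df is about 2.71 and the .05 critical value at 31 df is about 2.04, so the first coefficient clears the .01 level and the second clears only the .05 level, in line with the asterisks in Table 1.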

The results confirm the general hypothesis 
for hostility and tend to confirm it for depression.
However, the hypothesis of relationships
between affect states just prior to hypnosis
and hypnotizability was only confirmed
for the highly motivated subjects run in 
small groups. No relationships between affect 
states and hypnotizability were found in the 
large semicaptive group. 

Although no formal analysis was done on 
the other MMPI trait scales which were 
scored, the means of high hypnotizables and 
nonhypnotizables were examined. These mean 
standard scores were all within three points 
difference, making significance highly un- 
likely. The only exception was the control 
scale where the nonhypnotic subjects scored 
about six points higher than the high hyp- 
notic subjects; even this difference was not 
significant (t = 1.81, p > .05).

Another analysis was performed on the 112 
subjects run in small groups in order to see 
if the relationship between the hostility-state 
measure and hypnotizability held over the 
entire range of hypnotizability. Subjects were 
classified into high hypnotics, low hypnotics, 
and nonhypnotics using the cutting points 
established prior to the study. Subjects’ scores 
on the MAACL hostility scale were classified 
as low or high using the group median as a 
cutting point. The resulting classifications 
are contained in Table 2. 

TABLE 2
RELATION BETWEEN MAACL HOSTILITY AND HYPNOTIZABILITY

                                        Hostility
Subject                            Low     High     Total
High hypnotic (HGSHS ≥ 9)           24      11        35
Low hypnotic (HGSHS 4-8)            18      39        57
Nonhypnotic (HGSHS ≤ 3)              7      13        20
Total                               49      63       112

This cross-classification yielded a χ² of
13.74 (df = 2, p < .01). It can be seen that
the relationship between these variables is
produced by the low hostility scores of the
high hypnotic subjects relative to the scores
of the subjects in the lower classifications.
The low hypnotics and nonhypnotics showed
about the same distribution of low and high
hostility scores, with about twice as many
falling into the high hostility category. In the
high hypnotics about twice as many subjects
fell into the low hostility category as into the
high hostility classification.

The final analysis concerns the relationships
between hypnotizability and acquiescence
response set postulated by Hilgard.
These results are contained in Table 3.

TABLE 3
CORRELATIONS BETWEEN ACQUIESCENCE MEASURES AND HYPNOTIZABILITY

Small groups    MAACL no. checked    MMPI no. “true”    MMPI versus MAACL
Run 1                .49**                .19                 .30
Run 2                .20                  .01                 .33*
Run 3                .33              [illegible]         [illegible]

* p < .05.
** p < .01.

Using Hilgard’s measure of acquiescence
response set, number of “true” responses on
the MMPI, no significant correlations with
hypnotizability were found in any of the
runs. Using an MAACL measure of acquiescence,
number of adjectives checked, the correlations
with hypnotizability tended to be
positive, but only reached significance in
Run 1. The two measures of acquiescence
tended to be positively correlated in Runs 1
and 2, but only the correlation in Run 2
reached significance.

Although the correlations between number
of adjectives checked on the MAACL and
hypnotizability were low in Runs 2 and 3,
the correlations between number of adjectives
checked and the hostility scores were high
and negative, ranging from -.61 to -.84.
If these correlations were partialed out of
the correlations between hostility and hypnotizability,
they would reduce them to insignificance.
Conversely, if the correlations
between number of adjectives checked and
hypnotizability were partialed out of the correlations
between hypnotizability and acquiescence,
the resultant correlations would all be
insignificant. It would seem that both the
tendency to check many adjectives and the 
more specific tendency of the good hypnotic 
subjects to check nonhostile or “friendly” 
adjectives and not to check hostile adjectives 
produce the relationship between MAACL 
hostility scores and hypnotizability. 
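
The partialing argument can be illustrated with the first-order partial correlation formula. This is a sketch under stated assumptions: the article does not say which formula was used, the function name is hypothetical, and the value -.61 for the correlation between adjectives checked and hostility is simply the lower bound of the reported range, assigned to Run 1 here for illustration only.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = MAACL hostility, y = HGSHS score, z = number of adjectives checked.
r_hostility_hgshs = -0.48    # Run 1, Table 1
r_checked_hgshs = 0.49       # Run 1, Table 3
r_hostility_checked = -0.61  # assumed: low end of the reported range

r_partial = partial_r(r_hostility_hgshs, r_hostility_checked, r_checked_hgshs)
print(round(r_partial, 3))
```

With these inputs the hostility correlation shrinks from -.48 to roughly -.26, which illustrates how removing the number-checked component could reduce the coefficient to insignificance in a run of about 40 subjects.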


DISCUSSION 


The results of this study tend to confirm 
the hypothesis that affect states just prior 
to hypnosis are related to subsequent hyp- 
notizability, but that general affect traits are 
unrelated to hypnotizability. These conclusions
must be qualified by the following:

1. Affect states are related to hypnotiz- 
ability in motivated volunteer subjects run 
in small groups, but not in less motivated 
subjects run in large groups. This restric- 
tion may be due either to motivation or group 
size. According to Orne,⁴ the motivation is
probably the more crucial variable in produc- 
ing higher HGSHS scores. At any rate, the 
motivation of the subject is probably a source 
of variation in hypnotizability which would 
tend to obscure the relationship between the 
affect state of the subject and hypnotizability. 

2. Hostility, in particular, seems to be the 
affect most consistently related to hypnotizability.
The high hypnotizables’ low hostility
or “friendliness” in contrast to the high
hostility in subjects from all lower levels of
hypnotizability produces this relationship.

3. The relationship between hostility state
and hypnotizability is complicated by the
confounding factor of a correlated response
set on the measure used to measure hostility
state. Perhaps the tendency of the high hypnotizables
to check many adjectives was another
expression of their cooperative and
friendly attitude in the situation. It is not 
an expression of a generalized trait of acquies- 
cence judging from the correlations between 
the MAACL and MMPI acquiescence meas- 
ures. Other authors (Gage, Leavitt, & Stone, 
1957; McGee, 1962) have questioned whether 
there is a generalized acquiescence “style.” 

A relationship between a state measure and 
hypnotizability is of no practical significance, 
since it cannot aid in selection of good po- 
tential hypnotic subjects before the hypnotic 
testing session. But these replicated results, 
in an area where investigators have been 
pursuing personality trait correlations, point 
up the importance of the distinction between 
state and trait. The findings may have im- 
portant implications for other areas of re- 
search where investigators are attempting to 
find correlations between situation-specific 
responses and generalized personality traits. 
Few would argue against the supposition 
that the subjects’ immediate motives and 
feelings are more important in determining 
their immediate responses than past motives 
and moods. Yet most investigators use meas- 
ures which call for self-reports on past motive 
states and moods rather than present ones. 
Simple self-report instruments are needed to 
test current states, since the usual question- 
naire is too cumbersome for this purpose. 
Self-ratings tend to be ad hoc devices with 
little standardization or subtlety. An instru- 
ment like Gough and Heilbrun’s (1965) Adjective 
Check List could easily be adapted as 
a motive-state measure, and the MAACL 
seems to be a good mood-state measure. 

If trait measurements are desired, then an 
alternative to the single-shot omnibus ques- 
tionnaire approach could be a time sampling 
approach using repeated testings with state 
measures. This would give an estimate of the 
variability of a motive or mood, in addition 
to the mean level. Mean level of a state over 
time could be regarded as a trait measure. 
In the course of such testing, responses to 
different kinds of situations could be sampled. 
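
The time-sampling idea can be illustrated with a minimal sketch. The scores below are hypothetical and are not data from the study; the function name and the use of the standard deviation as the index of variability are likewise assumptions made only for illustration.

```python
from statistics import mean, stdev


def summarize_states(scores):
    """Return (mean level, variability) for one subject's repeated state scores."""
    if len(scores) < 2:
        raise ValueError("need at least two testings to estimate variability")
    # The mean over occasions serves as the trait estimate; the standard
    # deviation (an assumed choice) indexes how much the state fluctuates.
    return mean(scores), stdev(scores)


# Hypothetical hostility-state scores for one subject across five sessions.
sessions = [4, 6, 5, 9, 6]
trait_level, variability = summarize_states(sessions)
print(trait_level, round(variability, 2))
```

Two subjects with the same mean level could differ widely in variability, which a single-shot questionnaire would not reveal.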

While involving more test contacts (of briefer 
duration), such an approach may yield a 
more accurate picture of personality because 
it would consider the powerful effects of 
situations and time trends in behavior. 

INTEGRATIVE BEHAVIOR IN ADOLESCENT BOYS AS 
A FUNCTION OF DELINQUENCY AND RACE¹ 


V. EDWIN BIXENSTINE AND RALPH L. BUTERBAUGH² 


Kent State University 


This study explores integrative behavior (i.e., behavior maximizing reward 
over time) in human Ss (88 adolescent boys) all evincing, via intelligence test 
performances, comparable symbolizing capacity. The findings were that delin- 
quency is related to integrative failure as measured by choice of a small but 
immediate reward over a large but remote reward (candy). However, con- 
trary to expectancy, Negro boys behaved more integratively than white boys. 
These findings are discussed in connection with results found on auxiliary 
measures made on all Ss. 


Integration as a conceptual bridge between 
learning psychology and ego psychology was 
first set forth by Mowrer and Kluckhohn 
(1944). If an organism adjusts by learning 
behaviors which maximize immediate rewards, 
it integrates by learning behaviors which 
maximize rewards over time. Mowrer and 
Ullman (1945) discovered that rats, while 
fair adjustors, are marginal integrators, being 
unable to surmount limits imposed by the 
gradient of reinforcement. However, Bixen- 
stine (1956) demonstrated that when the rat 
is afforded a sign concomitant with the im- 
mediate effect, but representing the remote 
absent effect, then integration occurs with- 
out measurable limitation by the temporal 
gradient. 

If animals are dependent on signs supplied 
from without, humans are notorious self- 
producers and users of signs in the interest 
of “time-binding” (Korzybski, 1933), which 
accounts for the immense superiority in inte- 
gration of sign-rich humans over sign-poor 
subhumans. However, remaining to be ex- 
plained is the relative superiority found among 
humans, all of whom are sign users. One 
relevant dimension is intelligence which, if 
unitary at all, is surely so in terms of a 


¹ This study was supported in part by Grant No. 
M3291c2 from the National Institute of Mental 
Health. The authors wish to thank Charles Boltuck, 
anne at Apple Creek State Hospital, Apple Creek, 


sign-using factor. But another dimension, of 
largely undetermined structure and complex- 
ity, is character. This research studied the 
relationship between character (operationally 
defined by delinquency versus nondelin- 
quency), race, and integrative behavior. 
Our procedure was adopted from Mischel 
(Mischel, 1958, 1961a, 1961b, 1961c; 
Mischel & Gilligan, 1964; Mischel & Metz- 
ner, 1962), who has already broken consider- 
able ground. Mischel presented children with 
the choice of a small-immediate versus a 
large-remote reward and found that: (a) 
Negro children of Trinidad island prefer the 
small-immediate reward more than East In- 
dian children of the same island; (b) prefer- 
ence for delayed gratification is related to 
need for achievement and to social responsi- 
bility (for Caribbean island subjects); and 
(c) American children were found to prefer 
the large delayed reward as a function of the 
delay interval and the age and intelligence 
of the subject. Melikian (1959) also found 
(using Palestinian Arab children) a relation- 
ship between intelligence and integration 
(choice of large delayed reward). 
Mischel (1961c) used delinquent children 
(Caribbean) in his study of social responsi- 
bility and integration. However, he did not 
control the age and intelligence which, in 
view of his own and Melikian’s findings, 
would be indicated. The present study at- 
tempted to control these factors and 
introduced minor modifications on Mischel’s 
procedures, as well as certain secondary 
dependent measures thought to be relevant. 

METHOD 
Subjects 

The Ss were 88 boys, ages 13 to 16 years, half 
of whom were drawn from a school for delinquents 
to which they had been committed by court order. 
Of these delinquent boys, 22 were Negro and 22 
were white. The 44 nondelinquent boys came from 
a junior high school in a nearby industrial city 
similar to that from which the delinquent group 
originated. This school had a large Negro representa- 
tion and served primarily low-middle and low-class 
children. The Ss with records of delinquency were 
eliminated. Each nondelinquent was chosen to match 
a delinquent boy of the same age, color, and intel- 
ligence. It is presumed that a gross match of socio- 
economic background was achieved by the selection 
of the school. The experimenter administered the 
California Test of Mental Maturity, Short Form, 
junior high level, to the delinquents; scores on 
this test were posted yearly for the nondelinquents 
by their school. 

IQ scores on the California test for delinquents 
ranged 75-109 with a mean of 89.4; for the non- 
delinquents the range was 76-114 with a mean of 
89.9. The age range of delinquents was 13 years, 
1 month to 16 years, 11 months with a mean of 
15 years, 2 months. Differences between matched 
pairs did not exceed 9 IQ points nor 10 months 
in age. 


Procedure 


The Ss were asked to take part in research involving 
certain simple tasks. At the end of this 
presumed data collection, Ss were given as “pay- 
ment” the choice of an immediate-small versus 
delayed-large reward (candy). Although choice be- 
havior was the dependent variable of principal 
interest, rather than give S tasks of a trivial or 
irrelevant nature, we selected tasks useful in pro- 
viding additional understanding of S’s behavior. 
Following the explication of procedure, a description 
of these secondary, dependent variables and the 
rationale for their selection will be set forth. 

The Ss were gathered in groups of 44 on successive 
days of the week and at the same time during the 
day. They were seated far enough apart in rooms 
at the schools to prevent viewing each other’s 
written responses. Clocks in the rooms were covered, 
and Ss were asked to remove and place inside their 
desks any watches on their persons. 

After the E had administered two of the tasks 
(The Porteus Maze Test and time estimation), he 
thanked the Ss and explained that as payment for 
their participation he would dispense 5¢ Milky Way 
bars. However, since E was concerned that he 
might “run short” of bars, any S who so indicated 


³ Ohio law defines the delinquent as any child 
under 18 years of age who violates the law, who is 
intractable to control from parents, teachers, or 
guardian, who is truant, or who is injurious to self 
or others. 

could wait until E returned in 1 week and 
receive not one, but three 5¢ Milky Way bars. 
Mischel has followed a similar procedure, except 
he has used varying sizes rather than numbers 
of bars. Also, Mischel had the S make a sign indi- 
cating his choice on his record sheet, which was then 
handed to E. As this in itself introduced a minor 
delay, the E in the present study immediately 
passed out the candy to Ss who signaled that this 
was their choice. 

Following the candy choice, E “discovered” additional 
time was at hand and invited Ss to perform 
one more task. The sequence of tasks was, then: 
(1) Years X, XI, XII, and XIV of the Vineland 
revision of The Porteus Maze Test; (2) an estima- 
tion of the passage of time; and, (3) following the 
candy choice, a choice of immediate versus delayed 
hypothetical reward as presented in story form. 
These will be described in turn. 

1. The mazes. Years X, XI, XII, and XIV were 
selected because pilot work with early teenage chil- 
dren revealed that these levels gave a fairly wide 
range of error scores. The Ss were given one level 
at a time. They were told not to lift their pencils 
from the paper and to retrace when necessary in 
order to exit from the maze. Scores were errors in 
terms of number of cul-de-sacs entered. The mazes 
presume to measure planfulness and orderliness and 
should relate to integrative candy choice and to 
delinquency. 

2. Time estimation. Levine, Spivack, Fuschillo, 
and Tavernier (1959) found a relationship between 
an estimation of elapsed time and “motor inhibition” 
as measured by control over script written between 
narrow lines. We employed a similar measure, but 
instead of asking S to say “stop” following a given 
time, E called “stop” and asked S to estimate or 
give the time. This method seemed essentially 
equivalent to that of Levine et al., but easier to 
use with a group. This task was included because 
the work of Levine et al. suggested that brevity 
of estimated time is related to impulse control. The 
Ss who show delaying ability ought to report that 
the elapsed time appeared short, while those lacking 
symbolic capacity should experience time as 
dragging and report longer estimates. Time periods 
were chosen which, according to pilot work with 
teenage children, rendered a broad range of estimates. 
These were, by the order administered: 60, 30, 120, 
and 90 seconds (a total of 300 seconds). The Ss 
were asked to estimate the time elapsing between 
the signals “start” and “stop” in minutes and 
seconds and record this on paper. 

3. Story-reward choice. The following story situa- 
tions, accompanied by two choices, were read to 
the Ss: 


Your grandmother gives you a savings bond 
for Christmas. If you cash it now, it will be 
worth $18.75, but if you wait a few years, it 
will be worth $25.00. Would you rather: (a) Cash 
in the bond now, (b) wait until it is worth 
more money. 


TABLE 1 
CHOICES OF IMMEDIATE AND DELAYED CANDY 

            Delinquent           Nondelinquent        Total 
Race     Delay   Immediate    Delay   Immediate    Delay   Immediate 
Negro     17        5          21        1          38        6 
White     10       12          18        4          28       16 
Total     27       17          39        5          66       22 


An uncle of yours dies and leaves you a valu- 
able stamp collection, which, you are told, becomes 
more valuable every year. If you did not want 
to keep the stamps for your collection, would 
you rather: (a) Sell them now and have the 
money to spend, (b) keep them for a while and 
get a much higher price for them later. 


You get a job after school in the afternoons 
and your boss asks whether you would rather be 
paid every day or once a week. He offers to give 
you a dollar more a week if you will wait until 
the end of the week to get paid. Would you 
prefer: (a) To get paid once a week, (b) to get 
paid every day. 

You win a contest and the prize is $1,000. Your 
prize money can be paid to you in two ways: 
$1,000 right now or $100 each month for a year, 
which is actually $200 more in a year’s time. 
Which would you choose: (a) $1,000 in one payment, 
(b) $100 a month for a year. 


“Buffer” stories separated each of the above and 
dealt with similar content but presented a choice 
other than that between immediate versus delayed 
rewards. The Ss were asked to register which 
choice they preferred. The rationale for this 
task is clear: A relationship should obtain between 
actual choice of an immediate or a delayed reward 
and a hypothetical choice of the same nature. 


RESULTS 
Primary Dependent Variable 


In Table 1, the data are represented for 
choice of immediate-small versus delayed- 
large candy. Delinquent Ss chose candy im- 
mediately more often than did nondelin- 
quents (17 of 44 compared to 5 of 44); 
however, white Ss, both delinquent and non- 
delinquent, chose immediate candy more 
often than Negro Ss (16 of 44 compared 
to 6 of 44). The chi-square values in Table 2, 
derived from the frequencies in Table 1, make 
evident that these ratios are statistically sig- 
nificant. Noteworthy, too, is that while white 
delinquents differ significantly from nondelin- 
quents, Negro delinquents do not differ sig- 
nificantly from Negro nondelinquents. This 
is in keeping with the fact that the major 
delinquency effect appears (from Table 1) to 
be white rather than Negro. 
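
The derivation of the chi-square values from the Table 1 frequencies can be sketched as follows. The text does not state which formula was used; the sketch assumes the usual 2 x 2 chi-square with Yates' correction for continuity, an assumption that appears to reproduce the tabled values to within rounding. The function name is illustrative only.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with Yates' correction.

    The continuity correction is an assumption; the source does not name it.
    """
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Frequencies from Table 1, entered as (delay, immediate) for each group.
comparisons = {
    "Nondelinquent vs. delinquent": (39, 5, 27, 17),
    "Negro vs. white": (38, 6, 28, 16),
    "White delinquent vs. white nondelinquent": (10, 12, 18, 4),
    "Negro delinquent vs. Negro nondelinquent": (17, 5, 21, 1),
    "Negro delinquent vs. white delinquent": (17, 5, 10, 12),
    "Negro nondelinquent vs. white nondelinquent": (21, 1, 18, 4),
}

for label, cells in comparisons.items():
    print(f"{label}: {yates_chi_square(*cells):.3f}")
```

Each comparison has 1 degree of freedom, so the conventional critical values are 3.84 at the .05 level and 6.63 at the .01 level.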


Secondary Variables 


Correlation coefficients were computed be- 
tween six measures, the two independent and 
four dependent variables, and are presented 
in Table 3.⁴ Looking at the first row (delayed 
candy choice), we see that the essentials of 
the analysis presented in the previous section 
are recapitulated: Delinquents correlate nega- 
tively with delayed candy choice while Ne- 
groes correlate positively therewith. However, 
only one of the three auxiliary variables 
correlates with delayed candy choice: Esti- 
mated elapsed time is significantly shorter for 

⁴ Pearson product-moment r’s were computed 
where feasible. Biserial and tetrachoric r’s were 
computed where indicated and necessary. The records 
of 82 Ss were used in computing r’s, as the data 
for 6 Ss were incomplete in one or more of the 
auxiliary measures. 


TABLE 2 
CHI-SQUARE VALUES DERIVED FOR CANDY CHOICE 

Comparison                                        χ²       p 
Nondelinquent vs. delinquent                     7.332    .01 
Negro vs. white                                  4.910    .05 
White delinquent vs. white nondelinquent         4.810    .05 
Negro delinquent vs. Negro nondelinquent         1.736    .20 
Negro delinquent vs. white delinquent            3.451    .10 
Negro nondelinquent vs. white nondelinquent       .903     — 


TABLE 3 
CORRELATION COEFFICIENTS BETWEEN FOUR DEPENDENT AND TWO INDEPENDENT VARIABLES 

                                 Primary 
                                 dependent    Independent variable       Auxiliary dependent variable 
                                 variable 
Variable                         Delayed      Negro vs.   Delinquent     Maze       Estimated   Delayed 
                                 candy        white       vs. non-       errors     time        story reward 
                                 choice                   delinquent                            choice 
Delayed candy choice               —          .293***     -.279**        .062       -.244*       .017 
Negro vs. white                  .293***        —         -.002ᵃ         .191       -.040       -.153 
Delinquents vs. nondelinquents  -.279**      -.002ᵃ         —           -.391***     .318***    -.249* 
Maze errors                      .062         .191        -.391***        —         -.052        .004 
Estimated time                  -.244*       -.040         .318***      -.052         —         -.042 
Delayed story reward choice      .017        -.153        -.249*         .004       -.042         — 


ᵃ The r is not precisely zero due to the fact that six Ss were not employed in computing correlation coefficients. 


*p < .05. 


those choosing the delayed candy. Conspicu- 
ous and surprising is the absence of a rela- 
tionship between delayed candy choice and 
delayed story reward choice, the latter being 
in form the same as the former, except that 
the choice is hypothetical rather than real. 
Of additional interest is the fact that, while 
these measures were uncorrelated with each 
other, they both correlate significantly (if 
inversely) with delinquency. (Delinquency is 
the only variable with which delayed story 
reward choice does correlate.) The implica- 
tion is that these measures tap at least two 
factors and that delinquency is a function of 
both. 

Delinquency correlates significantly with 
all four dependent measures. The largest r 
(-.391) is with errors made on the Porteus 
mazes. This correlation is remarkable, how- 
ever, for a reason other than its size; it is 
opposite in sign to that most reasonable to 
predict. As expected, delinquents as com- 
pared with nondelinquents chose immediate 
rewards (real or hypothetical) and over- 
estimated the passage of time (the mean 
cumulative estimation of 338.4 seconds 
compared to 274.4 seconds for an actual 
period of 300 seconds); but on a task often 
presumed to assess forethought, impulse 
control, and planfulness, they did surprisingly 
and significantly better than non- 
delinquents. However, maze errors do not 
correlate with any of the other dependent 
variables, which now fosters the implication 
that delinquency is a function of at least 
three factors—one of which is common to 
choice of both immediate over delayed candy 
and to overestimation of time; one common 
to the hypothetical choice of immediate 
reward; and one common to freedom from 
maze errors. 

Under these circumstances we would expect 
that a multiple correlation between all four 
dependent variables ought to show a sizable 
improvement over the best single coefficient 
found (viz., delinquency with maze errors, 
-.391). A multiple R was computed and was 
found to be .576. The difference between 
this coefficient and .391 renders a t of 2.569 
(df = 81) which, for a one-tailed test, is 
significant at p < .01.⁵ 

⁵ In order to calculate this t, cognizance was 
taken of the correlation between the correlations. 
Clearly, a multiple correlation will itself correlate 
with one of the factors which comprise it. This 
correlation coefficient was estimated using the 
following: 

$$ r_{R \cdot r_{DM}} = \frac{\sqrt{\beta_{DM}\,(r_{DM})}}{R} $$ 

where $r_{R \cdot r_{DM}}$ is the coefficient of correlation be- 


DISCUSSION 


The results confirm a direct relationship 
between integrative choice of a large delayed 
reward and delinquency. Of some note is the 
finding that Negro adolescents, and particu- 
larly delinquent Negroes, were more likely to 
choose the large delayed reward than were 
whites. Since an endeavor was made to con- 
trol effects due to age, intelligence, and socio- 
economic influences, it is reasonable to assign 
this difference in integrative behavior to a 
nonintellective, nonclass factor(s) which was 
labeled character. 

If freedom from acts of delinquency, on 
the part of children whose social circum- 
stances may often tempt them to such acts 
by local example, is our hallmark of ef- 
fective character development, then it is to be 
hoped that delinquency, as a variable, cor- 
relates with measures ostensibly sensitive to 
character differences. Noteworthy is the fact 
that delinquency relates not merely to in- 
tegrative choice of reward, but to all three 
secondary variables in addition. The multiple 
R of .576 indicates that a weighted combina- 
tion of our dependent variables would do a 
fairly presentable job of predicting delin- 
quency. This statement is defensible in light 
of the fact that the delinquency and non- 
delinquency groups were equated on intel- 
lective and socioeconomic grounds which are 
frequently cited as themselves pertinent to 
the development and appearance of delinquent 
behavior. 

That the multiple R did improve markedly 
over any component r suggests that delin- 
quency is not a simple or unifactored state. 
Integrative choice of reward shared variance 
with the estimation of elapsed time, but maze 
errors and choice of a hypothetical reward 
were each independent of the other variables. 
Thus, delinquency appears to be a function 
of at least three independent factors. Are 
these all characterological in nature? At this 
point there is no evidence to argue other- 
wise, and the conclusion is that character is 
likewise multifactored in structure. Integrative 
choice of a large delayed reward would 
appear to tap, in common with a measure 
of the person’s sense of time, only one side 
of a many-sided character structure. 

tween the multiple R and the r between delinquency 
and maze errors; $\beta_{DM}$ is the beta weight for the 
delinquency-maze-error relationship; and $r_{DM}$ is the 
correlation between delinquency and maze errors. 

Simplicity would have been served had one 
central characterological measure been found 
to have a dominant relationship with delin- 
quency. We must conclude, however, that 
integrative choice, though useful, does not 
serve this role and that the evidence is that 
no unifactored measure could. The multi- 
plexity of delinquency makes it possible that 
the Negro as compared with the white delin- 
quent, even though similar in regard to age, 
intellective, and socioeconomic factors, is 
nonetheless a different kind of delinquent 
and, presumably, has a different kind or type 
of delinquent character. Just how these kinds 
of character would be construed is, at this 
juncture, open to speculation and further 
research. 

The purpose of this research was to follow up some recent findings of Gordon 
which showed that psychological interpretations based on partial Rorschach 
protocols were related to the personality dimension of anality in the clinicians. 
In the present study, 44 practicing clinicians were asked to respond to a 3-page 
document which they were told was a partial transcript of an initial inter- 
view with a patient. The results confirmed the earlier findings that high-anal 
clinicians have less confidence in their interpretations, make fewer specific 
predictions, and identify less pathology in patients than low-anal clinicians. 
This study provided additional evidence that the personality of the clinician in- 
fluences his clinical decisions. Additional hypotheses related to interactions 
between the personality of the patient as presented in the clinical data and the 
personality of the clinician were not confirmed. 


The purpose of this research was to follow 
up some recent findings of the experimenter 
(Gordon, 1966). The main findings of the 
previous study had to do with the effects of 
the personality of the clinician on his psycho- 
logical interpretations of data derived from a 
Rorschach protocol. It was found that clini- 
cians who score high on a paper-and-pencil 
measure of anality have less confidence in 
their interpretations, make fewer specific pre- 
dictions about the patient, and find less pa- 
thology in the patient than clinicians who 
score low on the test of anality. These find- 
ings confirmed predictions based on the psy- 
choanalytic view of anality (Fenichel, 1945; 
Schafer, 1954) and indicated that the clinical 
decision is not independent of the clinician’s 
personality. 

The present study was designed to estab- 
lish whether these predictions held true when 
the clinician was presented with a somewhat 
different quasiclinical situation. Instead of a 
partial Rorschach protocol, the clinicians re- 
ceived a three-page document which they 
were told was a partial transcript of an ini- 
tial interview with a patient. There were two 
versions of the transcript—one presenting a 
patient with many anal characteristics and 
one presenting a patient with fewer anal 
characteristics. Thus, the present research at- 
tempted to study not only the effects of anal- 
ity in the clinician on clinical decision making, 
as in the previous study, but also to 
study the effects of interactions between 
anality in the clinician and anality in the 
patient. 

The decision measure was a revised version 
of the Interpretive Decision Scale (IDS) 
which the investigator devised for the earlier 
study. It consists primarily of lists of inter- 
pretations which are to be checked by the 
subject if he thinks they are appropriate for 
the available clinical data. Three decision 
measures (those which led to fruitful findings 
in the previous research) were derived from 
the scale: 

1. Confidence score—number of interpreta- 
tions about which subjects were confident. 

2. Positive score—proportion of checked 
interpretations which were positive, which 
referred to inferences of health from the data. 

3. General score—proportion of checked 
interpretations which were general, that is, 
did not make specific predictions about the 
patient. 

HYPOTHESES 


Confidence Score 

Subjects low on anality have higher confi- 
dence scores than subjects high on anality. 
This hypothesis was based on the psycho- 
analytic conception (Fenichel, 1945; Schafer, 
1954) of anality as involving vacillation, in- 
decision, and lack of confidence. This hy- 
pothesis was confirmed at the .02 level of 
significance by Gordon (1966). 


Positive Score 


Subjects low on anality have lower positive 
scores than subjects high on anality. This 
hypothesis was based on the psychoanalytic 
conception (Schafer, 1954) of anality as in- 
volving reaction formation against hostility. 
Schafer hypothesized that this personality 
constellation leads to a tendency to empha- 
size the positive, healthy aspects of patients 
and to deemphasize pathology. This hypoth- 
esis was confirmed at the .01 level of signifi- 
cance by Gordon (1966). 


General Score 


Subjects low on anality have lower general 
scores than subjects high on anality. This 
hypothesis was based on the psychoanalytic 
conception (Fenichel, 1945) of anality as in- 
volving tendencies toward generalizations and 
failures to make commitments. This hypoth- 
esis was confirmed at the .05 level of signifi- 
cance by Gordon (1966). 


Interactions 


The above hypotheses were expected to 
apply to high-anal clinicians as compared to 
low-anal clinicians, regardless of whether they 
were confronted with a high- or low-anal pa- 
tient. In the previous study, the anality of 
the patient was not at issue and was not 
varied, and the hypotheses were confirmed. 
However, in the current study, it was pre- 
dicted that the transcript of the high-anal 
patient would arouse the high-anal clinician, 
thus accentuating his anal behavior. The low- 
anal clinician was not expected to be differ- 
entially aroused in response to the two dif- 
ferent types of patients. Thus, the following 
hypothesis was derived: There are significant 
interactions between anality in the patient 
and anality in the clinician on the response 
measures referred to in the above three hy- 
potheses. 


METHOD 

Subjects 

The Ss were 44 clinicians and trainees from the 
doctoral and postdoctoral programs of New York 
University who were working in hospitals and pri- 
vate clinics in New York City. They came from a 
variety of training backgrounds and represented 
experience levels of 1-9 years. The median level of 
clinical experience was approximately 3 years. The 
Ss were volunteers and were given the experimental 
materials which they took home with them and 
returned to the investigator at their leisure. 


Materials 

Anality measure. The Ss were divided into high- 
and low-anal groups on the basis of the Dynamic 
Personality Inventory (Grygier, 1956). Four scales, 
Ad, Ah, Ac, and Aa,¹ were used, and Ss were given 
total anality scores based on the sum of the four 
scales. (For the rationale for this particular use of 
the inventory, see Gordon, 1966.) The median score 
was used as the cutting point for dividing Ss into 
the high- and low-anal groups. 
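
The median split described above can be sketched briefly. The scale scores are hypothetical and are not data from the study, the function name is illustrative, and the treatment of scores falling exactly at the median (assigned here to the low group) is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def split_by_median(total_scores):
    """Label each subject 'high' or 'low' relative to the group median."""
    cut = median(total_scores.values())
    # Assumption: a score exactly at the cutting point is placed in the low group.
    groups = {s: ("high" if score > cut else "low") for s, score in total_scores.items()}
    return groups, cut


# Hypothetical (Ad, Ah, Ac, Aa) scale scores for four subjects.
scale_scores = {
    "S1": (3, 2, 4, 1),
    "S2": (5, 4, 4, 3),
    "S3": (2, 1, 2, 2),
    "S4": (6, 5, 3, 4),
}
# Total anality score is the sum of the four scales.
totals = {s: sum(scales) for s, scales in scale_scores.items()}
groups, cut = split_by_median(totals)
print(cut, groups)
```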

Interpretive Decision Scale. The revised scale con- 
sists of 40 interpretive statements about a hypo- 
thetical patient. Examples are, “The patient is per- 
ceptive and observant,” “The patient has temper 
tantrums,” “The patient is shy,” and “The patient 
is the life of the party.” The scale is presented along 
with clinical data about a particular patient (in 
this case, one of the two transcripts was presented). 
Instructions require Ss to check those statements 
which might be true of the patient, to put two 
checks in front of those about which they are more 
confident and which are probably true of the pa- 
tient, and to put three checks in front of those about 
which they are most confident and which are al- 
most definitely true of the patient. The confidence 
score is the number of items given three checks. 
Items in the scale are also classified into positive 
and negative statements and general and specific 
statements. Placement into the categories was based 
on the unanimous judgment of four graduate stu- 
dents in psychology. There are, thus, four kinds of 
items: (a) positive-general, such as, “The patient is 
intelligent,” (b) positive-specific, such as, “The pa- 
tient likes children,” (c) negative-general, such as, 
“The patient is a repressed person,” and (d) nega- 
tive-specific, such as, “The patient has difficulty 
sleeping.” The positive score is the proportion of 
positive items of the total number of items checked, 
and the general score is the analogous proportion 
for general items. Pilot studies led to the develop- 
ment of a 40-item scale with 10 of each of the 
four types of items. 
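
The three scores derived from the scale can be illustrated with a minimal sketch. The four items are the examples quoted above; the response pattern is hypothetical, and the function and variable names are assumptions introduced only for illustration.

```python
def score_ids(responses, item_types):
    """Score one subject's Interpretive Decision Scale responses.

    responses: item -> number of checks given (0 to 3).
    item_types: item -> (valence, scope), e.g. ("positive", "general").
    Returns (confidence score, positive score, general score).
    """
    checked = [item for item, checks in responses.items() if checks >= 1]
    if not checked:
        raise ValueError("no items were checked")
    # Confidence score: number of items given three checks.
    confidence = sum(1 for checks in responses.values() if checks == 3)
    # Positive and general scores: proportions of all checked items.
    positive = sum(1 for i in checked if item_types[i][0] == "positive") / len(checked)
    general = sum(1 for i in checked if item_types[i][1] == "general") / len(checked)
    return confidence, positive, general


item_types = {
    "The patient is intelligent": ("positive", "general"),
    "The patient likes children": ("positive", "specific"),
    "The patient is a repressed person": ("negative", "general"),
    "The patient has difficulty sleeping": ("negative", "specific"),
}
# Hypothetical response pattern for one subject.
responses = {
    "The patient is intelligent": 3,
    "The patient likes children": 1,
    "The patient is a repressed person": 0,
    "The patient has difficulty sleeping": 2,
}
confidence, positive, general = score_ids(responses, item_types)
print(confidence, round(positive, 2), round(general, 2))
```

Because the positive and general scores are proportions of checked items, they do not depend on how many statements a given S chose to check.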

Transcripts. The transcripts were not actual tran- 
scripts of initial interviews, but were written spe- 
cifically for this study. They were equal in length 
(approximately 700 words) and were designed to 
be as equivalent as possible on all variables other than 
anality. Thus, the fictitious patient describes similar 


¹ The four scales used in this experiment are 
described by Grygier as follows: Ah—hoarding be- 
havior, anxious possessiveness, and stubborn, cling- 
ing persistence; Ad—attention to details, orderli- 
ness, conscientiousness, and perfectionism; Ac— 
conservatism, rigidity, tendency to stick to routine; 
Aa—submissiveness to authority and order. 

DECISION MAKING IN A CLINICAL SETTING 


TABLE 1 
ANALYSIS OF VARIANCE FOR CONFIDENCE, POSITIVE, 
AND GENERAL SCORES 

                                          F 
Source                   df   Confidence   Positive   General 
                                score        score      score 
Clinician anality (A)     1     6.33*      13.75***     3.81 
Patient anality (B)       1      .37        6.25*       1.90 
A × B                     1      .91        3.13         .00 
Within groups            40       —           —           — 

*p < .05. 
***p < .001. 


TABLE 2 
ANALYSIS OF VARIANCE FOR SICKNESS, LIKABILITY, 
AND ANALITY RATINGS 

                                          F 
Source                   df    Sickness   Likability   Anality 
                                rating      rating      rating 
Clinician anality (A)     1    16.19***    11.62** 
Patient anality (B)       1     2.64         .98       34.45*** 
A × B                     1     3.63         .98         .90 
Within groups            40       —           —           — 

**p < .01. 
***p < .001. 
symptoms (anxiety, sleeping difficulties, quarrels 
with his wife) on both transcripts; the major dif- 
ference between the two is that in the high-anal 
transcript, the patient’s difficulties are expressed in 
terms of the psychoanalytic conceptualization of the 
defensive style of the anal character (vacillation, 
obsessive thinking, intellectualization, etc.). In addi- 
tion, the high-anal transcript includes one mention 
of anal content (the symptom of constipation). 
Pilot studies showed the two transcripts to be 
judged equivalent on such variables as the patient’s 
likability, intelligence, stability, social-class affilia- 
tion, and warmth. 

Additional response measures. The Ss were asked 
to rate the patient on seven-point scales for the 
following variables: need for therapy, degree of ad- 
justment, likability, and anality. Ratings of need 
for therapy and degree of adjustment were com- 
bined into a “sickness score,” and this score and 
the likability score were analyzed as additional 
data bearing on the hypothesis concerning the posi- 
tive score. If reaction formation and the tendency 
to see the patient as more healthy are characteristic 
of the high-anal clinician, it would be expected 
that he would rate the patient lower on the sick- 
ness score and higher on the likability score, and 
significant interactions between patient and S anal- 
ity would be found. The anality rating was in- 
cluded as an independent check of the degree of 
anality presented in the two transcripts. 


Procedure 


The Ss took the DPI several weeks prior to the 
presentation of the transcript, the IDS, and the 
rating scales. They were not informed of any rela- 
tionship between the two parts of the experiment. 

The Ss were divided into high- and low-anal groups 
(above and below the median of 13.5 for the four 
scales) and were randomly assigned to the high- 
and low-anal patient condition. 

The results were analyzed by 2 × 2 factorially 
organized analyses of variance; separate analyses 
were done for each of the response measures. 


RESULTS 

Confidence Scores 


The hypothesis concerning the confidence 
score was confirmed at the .025 level of sig- 
nificance (see Table 1). Low-anal clinicians 
had higher confidence scores than high-anal 
clinicians. 

Positive Score 

The hypothesis concerning the positive 
score was confirmed at the .01 level of sig- 
nificance (see Table 1). Low-anal clinicians 
had lower positive scores than high-anal
clinicians.

General Score 

The test of the hypothesis concerning the 
general score approached statistical signifi- 
cance at the .10 level of confidence (Table 
1). Low-anal clinicians showed a tendency 
toward lower general scores than high-anal 
clinicians. 

Interactions 

The hypothesis concerning interactions be- 
tween anality in the patient and anality in 
the clinician was not confirmed in relation to 
the confidence or general scores. The test of 
the hypothesis concerning interactions in rela- 
tion to the positive score approached signifi- 
cance at the .10 level of confidence (Table 1). 
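
The 2 X 2 factorial analyses of variance reported above can be sketched as follows. This is a minimal illustration rather than the authors' computation: the function name, the 0/1 coding of the two factors, and the demonstration scores are all hypothetical, and a balanced design (equal numbers of Ss per cell) is assumed, which the text does not state.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced 2 x 2 between-subjects ANOVA (illustrative sketch).

    cells maps (a, b) -> list of scores, where a codes clinician anality
    and b codes patient anality (0 = low, 1 = high; hypothetical coding).
    Every cell must hold the same number of scores.
    Returns the F ratios for A, B, and A x B and the within-groups df.
    """
    n = len(cells[(0, 0)])
    assert all(len(v) == n for v in cells.values()), "design must be balanced"

    grand = mean(x for v in cells.values() for x in v)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    a_mean = {a: mean(cells[(a, 0)] + cells[(a, 1)]) for a in (0, 1)}
    b_mean = {b: mean(cells[(0, b)] + cells[(1, b)]) for b in (0, 1)}

    # Each effect has 1 df, so its mean square equals its sum of squares.
    ss_a = 2 * n * sum((a_mean[a] - grand) ** 2 for a in (0, 1))
    ss_b = 2 * n * sum((b_mean[b] - grand) ** 2 for b in (0, 1))
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_within = sum((x - cell_mean[k]) ** 2
                    for k, v in cells.items() for x in v)

    df_within = 4 * (n - 1)
    ms_within = ss_within / df_within
    return {
        "F_A": ss_a / ms_within,
        "F_B": ss_b / ms_within,
        "F_AxB": ss_ab / ms_within,
        "df_within": df_within,
    }


# Hypothetical scores, three Ss per cell, for demonstration only.
demo = two_way_anova({
    (0, 0): [1, 2, 3],
    (0, 1): [2, 3, 4],
    (1, 0): [3, 4, 5],
    (1, 1): [4, 5, 6],
})
```

Under the balanced-design assumption, 11 Ss per cell would give the 40 within-groups degrees of freedom shown in Table 1.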


Additional Findings

High-anal clinicians rated the patient as
more likable and less sick than low-anal
clinicians (see Table 2). These findings may be
considered additional confirmation of the hy-
pothesis concerning the positive score. There
was also a tendency toward a significant in-
teraction (p < .10) between anality in the
patient and anality in the clinician on the
sickness rating.

The independent check of the anality ma-
nipulation in the transcript was significant at
the .001 level (Table 2).


DISCUSSION 


This study clearly confirms the major find- 
ings of Gordon (1966). Both studies found 
that high-anal clinicians have less confidence 
in their clinical interpretations, make fewer 
specific predictions, and find less pathology 
in their patients than low-anal clinicians. 
These findings confirm the psychoanalytic 
view of anality and the psychoanalytic theory 
of interpretation expressed by Schafer (1954). 
The generalization value of the results is in- 
creased by the fact that two studies using 
different kinds of clinical data led to essen- 
tially the same findings. 

The present study, in addition, was de- 
signed to study the effects of interactions be- 
tween patient anality and clinician anality on 
the decisions of the clinician. With the possi- 
ble exception of the positive score, hypothe- 
ses related to interactions between the per- 
sonality of the patient as presented in the 
clinical data and the personality of the cli- 
nician were not confirmed. 

There are two possible explanations for this 
failure of confirmation. First, it is possible 
that the anality manipulation was not suffi- 
ciently arousing to result in measurable ac- 
centuation of anal defenses in high-anal cli- 
nicians. This could be true in spite of the 
fact that the independent check of the anality 
manipulation in the patient (anality ratings) 
showed that degree of anality was varied. 
Follow-up studies should provide a more
arousing anality condition, and the experimenter
is planning a series of studies providing
such a condition by changes in the form
and the content of the manipulation. The actual
taped interview will be played in the
presence of the clinician and the experimenter, 
and the content of the tape will be revised to 
include what pilot studies show to be more 
arousing content. 

The second difficulty in the research would 
seem to be related to the failure of the IDS 
to measure small, subtle differences in anal- 
defensive behavior on the part of the clini- 
cians. Assuming, as the earlier study reason- 
ably permits one to do, that high- and low- 
anal clinicians differ in their habitual decision- 
making tendencies, failure to find significant 
interactions with variations in experimentally 
aroused anality may simply reflect the fact 
that high-anal subjects did not have enough 
“room” within the confines of the 3-point 
scale to show a significant increment in their 
normally extreme scores. In order to take 
this difficulty into account, the investigator 
is currently revising the IDS to provide a 
more sensitive response measure (i.e., a 7-point
scale instead of a 3-point scale). The
change in the response measures together with 
the improvement in the arousal manipulation 
should provide a more powerful test of the 
hypotheses concerning interactions between 
the personalities of the patient and the clini- 
cian. If studies using the improved method 
do not lead to confirmation of the interaction 
hypotheses, it may be necessary to reevaluate 
the theoretical formulations which led to the 
prediction that such interactions would occur.


4 high- and 4 low-functioning clients were exposed to a clinical interview con-
ducted by 1 high-functioning counselor and 1 moderate-functioning counselor.
During the middle period of the interview, the counselors lowered their levels
of the facilitative conditions of empathy, positive regard, genuineness, con-
creteness, and self-disclosure. The depth of self-exploration of the low-function-
ing clients was found to be a function of counselor-offered conditions, while
depth of self-exploration of the high-functioning clients was found to be
relatively independent of counselor conditions when seen by the high-function-
ing counselor. All clients improved in their levels of self-exploration with the
high counselor, and all clients tended to decline in level of self-exploration
when seen by the moderate-functioning counselor. Interactions among major
variables are presented and implications considered.


Truax and Carkhuff (1964) have brought 
together a body of evidence to suggest that 
therapist-offered levels of empathy, respect, 
genuineness, concreteness, and self-disclosure 
are related to client depth of self-exploration. 
These authors concluded that the evidence 
suggests that it is the counselor who determines
the level of therapeutic conditions and
that both the counselor and client contribute 
to the client’s depth of self-exploration. In 
an experimental approach to the question of 
causation, Truax and Carkhuff (1965) manip- 
ulated the level of therapist-offered conditions 
and found the depth of self-exploration of 
three female psychotics to be a function of 
the level of conditions provided by the counselor.
The communication process, however,
has broken down for psychotics and is char-
acterized as operating at the lowest levels of
facilitative interpersonal dimensions. It then
seemed reasonable to expect that low-functioning
clients would be almost completely dependent
upon the counselor in the verbal exchanges
in counseling and, hence, explore
themselves deeply only when the counselor is
providing high levels of therapeutic conditions.
Holder, Carkhuff, and Berenson (1967)
report a study to determine the effects of the
manipulation of therapeutic conditions upon
the depth of self-exploration of clients func- 
tioning at high levels and clients functioning 
at low levels of empathy, respect, genuineness, 
and concreteness. In that study, the authors 
cast 11 college students in the helping role.
The three functioning at the highest levels of 
therapeutic conditions and the three at the 
lowest were selected to participate as “clients” 
in a counseling project in which, unknown to 
the clients, the counselor offered high levels 
of conditions during the first 20 minutes, low 
conditions during the middle period, and high 
conditions during the final period. The depth 
of self-exploration of the low-functioning cli- 
ents was found to be a function of the level 
of counselor-offered conditions, while the high- 
functioning clients continued their initial 
level of self-exploration independent of the 
level of conditions offered by the counselor, 
and this level was significantly higher than 
that of the low-functioning clients. The pres- 
ent study was designed to replicate and elab- 
orate on the Holder, Carkhuff, and Berenson 
(1967) study by more rigorously assessing 
interactions emerging from manipulating lev- 
els of therapeutic conditions by high- and 
moderate-functioning counselors upon the
depth of self-exploration of clients function-
ing at high and low levels of empathy, re-
spect, genuineness, and concreteness.


METHOD 


Eighteen female college students were cast in the 
helping role of a counselor and given a set to be as 
helpful as they could in helping a standard inter- 
viewee work through his problems. The four highest 
functioning students and the four lowest functioning 
students were selected to participate as clients in 
the counseling project. The Ss reported having no 
difficulty changing their sets from that required in 
the “helping role” to that of a client-subject. Un- 
known to each client, two experienced counselors, 
one functioning at high levels of facilitative condi- 
tions and one functioning at moderate levels, as 
assessed and based upon previous research, offered 
each of the eight Ss high levels of conditions during 
the first third of a clinical interview, low levels 
during the middle 20 minutes, and reinstated high 
levels of conditions again during the last third of 
the interview in the manner of earlier research 
(Holder, Carkhuff, & Berenson, 1967; Truax & Cark- 
huff, 1965). The counselors presented themselves as 
trying to offer as much help as possible in the time 
allotted. The counselors did not lower conditions by 
being negative in their regard or destructive. Dur- 
ing the experimental manipulation period, the coun- 
selors attempted simply to withhold their best pos- 
sible therapeutic responses. Each counselor attempted 
to standardize the introduction to the initial and 
third periods and attempted to continue to make 
as many responses during the middle period as he 
did during the other periods. 

Selection of Ss was based on random excerpts 
taken from the tape-recorded sessions and rated by 
experienced raters on five-point scales (Carkhuff, 
1967) assessing the following dimensions of func- 
tioning which have been related to constructive 
change in psychotherapy and counseling: counselor 
empathy (E), respect (R), genuineness (G), con- 
creteness (C), self-disclosure (SD), and the degree 
to which the S explores himself (Ex). E ranges 
from Level 1, in which the counselor is unaware or 
ignorant of even the most conspicuous surface feel- 
ings of the counselee, to Level 5, in which the coun- 
selor communicates an accurate and empathic under- 
standing of the client’s deepest feelings. R ranges 
from the counselor’s clear communication of nega- 
tive regard to his communication of a deep caring 
for the client. G varies from the communication of a 
wide discrepancy between the counselor’s experience 
and his verbalizations to his being freely and deeply 
himself in the relationship. C ranges from vague and
abstract discussions to the direct discussion of spe- 
cific feelings and experiences. SD moves from Level 
1, in which the counselor actively attempts to re- 
main detached from the client and discloses nothing 
about himself, to Level 5, in which the counselor 
freely volunteers, with appropriate discriminations, 
information about his personal ideas, attitudes, and
experiences. Ex ranges from the lowest level, in
which the client does not explore himself at all, to 
the highest level, in which the client is searching to 
discover new feelings concerning himself and his 
world. Pearson r rate-rerate reliabilities for the two
raters involved in this study were as follows: E, .94,
.91; R, .93, .90; G, .90, .89; C, .89, .87; SD, .91,
.93; Ex, .90, .95. The intercorrelations between the
raters were as follows: E, .98; R, .99; G, .83; C,
.91; SD, .91; Ex, .78.
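
A rate-rerate reliability of the kind reported here is a Pearson product-moment correlation between a rater's first and second ratings of the same excerpts. The sketch below shows one way such a coefficient could be computed; the function name and the five-point ratings are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings of five excerpts on a five-point scale:
# the same rater's first pass and later re-rating.
first_rating = [1, 2, 3, 4, 5]
second_rating = [1, 2, 3, 5, 4]
rate_rerate = pearson_r(first_rating, second_rating)
```

The same function applied to two different raters' ratings of the same excerpts would give an interrater correlation.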

The mean performance of the “high-client” group 
was as follows: E, 2.88; R, 3.19; G, 3.30; C, 3.19; 
SD, 3.00. The mean performance of the “low-client” 
group was assessed at the following levels: E, 1.27; 
R, 1.47; G, 1.53; C, 1.75; SD, 1.38. All high-rated 
Ss were functioning well above Level 2 on all di- 
mensions, and all of the low-rated Ss were function- 
ing below Level 2 on all dimensions when cast in 
the helping role of a counselor. This was true even 
though both groups of Ss worked with standard 
interviewees who readily shared personal problems 
and experiences. The high group elicited an average 
Ex level of 2.50, while the low group elicited an Ex 
of 2.30. 

In previous research, the high-functioning coun- 
selor involved in this study had been found to be 
functioning at the following average levels of facili- 
tative conditions: E, 3.75; R, 3.60; G, 3.60; C, 3.40; 
SD, 3.25. The moderate-functioning counselor had 
been found to be functioning at the following average 
levels of facilitative conditions: E, 2.10; R, 2.20; G, 
2.07; C, 2.00; SD, 1.70. Both counselors were male. 
The high-functioning counselor reported having had 
7 years of experience and the moderate-functioning 
counselor 6 years. 

Both therapists had received instructions and 
training in the proper manipulation of facilitative 
conditions. During the first 15 minutes of each interview
they were told to perform to their full capacity
and be as helpful as possible. During the second 15 
minutes, the manipulation stage, both therapists at- 
tempted to lower the quality of conditions they 
offered, not by being completely and openly disre- 
spectful or incongruent, but simply by withholding 
what they considered to be their “best” responses. 
During the final minutes of the interview the thera- 
pists once again endeavored to be as helpful as they 
could. The therapists were not aware of the goals of
the study.

The design was counterbalanced: Two high and 
two low Ss saw the high-functioning therapist first, 
with the remaining Ss being first seen by the mod- 
erate-functioning therapist. All Ss saw both coun- 
selors on the same day. 
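
The counterbalancing described above can be sketched as a simple assignment routine. This is an illustration only: the function name and client labels are hypothetical, and the random choice of which two Ss in each group see Therapist H first is an assumption, since the text does not say how Ss were allotted to the two orders.

```python
import random


def counterbalance(high_clients, low_clients, seed=None):
    """Assign half of each client group to each therapist order.

    Returns a dict mapping client label -> (first therapist, second therapist),
    where "H" is the high-functioning and "M" the moderate-functioning therapist.
    Random allocation within each group is an assumption, not a documented step.
    """
    rng = random.Random(seed)
    order = {}
    for group in (high_clients, low_clients):
        shuffled = list(group)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for client in shuffled[:half]:
            order[client] = ("H", "M")   # sees Therapist H first
        for client in shuffled[half:]:
            order[client] = ("M", "H")   # sees Therapist M first
    return order


# Hypothetical labels for the four high- and four low-functioning Ss.
high = ["high1", "high2", "high3", "high4"]
low = ["low1", "low2", "low3", "low4"]
assignment = counterbalance(high, low, seed=0)
```

Splitting each client group separately keeps sequence from being confounded with client level, which is what allows the Patient Level X Sequence interaction to be tested.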

Each phase of the experimental interviews was 
divided into five 3-minute periods which were then 
randomly presented for rating to the trained raters.
The tapes were evaluated in terms of client depth
of self-exploration. In addition, the therapists’ per-
formances were checked by rating the level of con-
ditions they offered to the Ss. This was done to
provide a check on the experimental operations and
to make possible various secondary comparisons be-
tween therapists and clients.


MANIPULATION OF THERAPEUTIC CONDITIONS


TABLE 1


ANALYSIS OF VARIANCE: CLIENT DEPTH OF
INTRAPERSONAL EXPLORATION


Source      df        MS          F
P            1      38.600     28.532***
Q            1       3.444      2.546
PQ           1      15.377     11.366**
S/PQ         4       1.353
T            1    1282.282      6.036*
PT           1        .527       .117
QT           1        .625       .139
PQT          1       7.615      1.692
ST/PQ        4       4.501
M            2       2.795      5.217**
MP           2       1.079      2.015
MQ           2        .095       .177
MPQ          2        .490       .916
MS/PQ        8        .536
MT           2       7.707     28.599****
MPT          2        .340      1.269
MQT          2        .189       .699
MPTQ         2       1.050      3.896
MST/PQ       8        .270
C            4        .022       .375
CP           4        .093      1.608
CQ           4        .020       .349
CPQ          4        .076      1.311
CS/PQ       16        .058
CT           4        .329      3.189**
CPT          4        .046       .227
CQT          4        .235      2.284
CPQT         4        .040       .388
CST/PQ      16        .103
CM           8        .141      1.094
CMP          8        .046       .353
CMQ          8        .037       .287
CMPQ         8        .079       .609
CMS/PQ      32        .130
CMT          8        .116      1.630
CMPT         8        .040       .556
CMQT         8        .046       .642
CMPTQ        8        .156      2.194
CMST/PQ     32        .071


Note.—Abbreviated: P = patient level; Q = sequence
effect; S = subjects; T = therapist level; M = manipulation
stage; C = repeated measures within manipulation stages.

**p < .05.
***p < .01.
****p < .001.


RESULTS


Figure 1 represents a comparison of per-
formances of the high-functioning therapist
(Therapist H) and the moderate-functioning
therapist (Therapist M) during each of the
three experimental interview periods. The fol-
lowing results are of interest.

FIG. 1. Overall therapist functioning: Therapist
Level X Manipulation Stage. (Ordinate: level of functioning.)

1. As predicted, the level of conditions
offered by Therapist H during the first and
final interview periods was higher (p < .01)
than that offered by Therapist M. During the
second, or manipulation stage, the higher
levels obtained by Therapist H were not sig-
nificant.

2. Both therapists were successful in re- 
ducing the level of conditions they offered 
their clients during Stage 2 (Therapist H, p 
< .001 on all indexes; Therapist M, p < .05
on all indexes) and in reestablishing (Thera- 
pist M) or surpassing (Therapist H) original 
conditions levels during Stage 3. 

3. The level of therapeutic conditions of- 
fered was not significantly affected by client 
level for either therapist. 

The data on client depth of self-exploration 
are presented in Figures 2 and 3. Analysis of 
variance procedures were applied to the two- 
between, three-within design (Table 1), and 
when further clarification was necessary t
tests between means were used. 
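
In a mixed design of this kind each F in Table 1 is the effect's mean square divided by the mean square of its own error term (for example, P is tested against S/PQ and M × T against MST/PQ). The sketch below recomputes a few of the tabled ratios from the mean squares as read from Table 1; the dictionary layout and function name are illustrative assumptions, and small discrepancies reflect rounding of the printed mean squares.

```python
# Error mean squares as read from Table 1.
ERROR_MS = {
    "S/PQ": 1.353,
    "ST/PQ": 4.501,
    "MS/PQ": 0.536,
    "MST/PQ": 0.270,
    "CST/PQ": 0.103,
}

# effect -> (effect mean square, its error term, F as printed in Table 1)
EFFECTS = {
    "P": (38.600, "S/PQ", 28.532),
    "PQ": (15.377, "S/PQ", 11.366),
    "M": (2.795, "MS/PQ", 5.217),
    "MT": (7.707, "MST/PQ", 28.599),
    "CT": (0.329, "CST/PQ", 3.189),
}


def f_ratio(effect):
    """Return MS(effect) / MS(error) for one row of the table."""
    ms_effect, error_term, _printed = EFFECTS[effect]
    return ms_effect / ERROR_MS[error_term]


# Difference between each recomputed F and the printed value.
discrepancies = {
    name: abs(f_ratio(name) - printed)
    for name, (_ms, _err, printed) in EFFECTS.items()
}
```

Using a separate error term for each family of within-subject effects is what distinguishes this analysis from a fully between-subjects design, where a single within-groups mean square serves as the denominator throughout.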

Two significant main effects were exhibited
by these data: manipulation stage (M, p <
.05) and patient level (P, p < .01). In addition,
therapist level (T) was found to exert a
significant effect during Stage 3 (p < .001).
Significant and interpretable interactions were
as follows: Therapist Level X Manipulation
Stage (T X M, p < .001); Therapist Level
X Repeated Measures within Manipulation
Stages (T X C, p < .05); and the Patient
Level X Sequence Effect (P X Q, p < .05).
Finally, a high within-cell variance may have
masked the significance of an interaction be-
tween therapist level, patient level, and se-
quence effects.

FIG. 2. Client depth of self-exploration: Therapist
Level X Manipulation Stage. (Ordinate: level of functioning;
separate curves for Therapist H and Therapist M.)

FIG. 3. Client depth of self-exploration: Therapist Level X Patient Level X Manipulation Stage.
(Abscissa: period; separate curves for Therapist H and Therapist M.)

In summary, the following effects were indi- 
cated: 

1. There is a direct relationship between 
the quality of therapeutic conditions offered 
by a given therapist and the depth of intra- 
personal exploration attained by his client (a 
main T effect during Stage 3). 

2. There is a direct relationship between 
the level of conditions offered by subjects 
when cast in a helping role and the level of 
intrapersonal exploration they attain as 
clients, independent of the quality of external 
supervision (a main P effect). 

3. When Therapists H and M experi- 
mentally lowered the level of conditions of- 
fered a client, there was a corresponding drop 
in client depth of intrapersonal exploration 
(a main M effect). This effect was not ex- 
pected in the case of Therapist H with high- 
functioning clients and the trend was not 
statistically significant.

4. When Therapist H reinstated high levels 
of conditions during Stage 3, there was a 
consequent rise in client depth of self- 
exploration. It should be noted that the qual- 
ity of Therapist H’s performance during 
Stage 3 was substantially better than in 
Stage 1. Though Therapist M was able to 
reestablish moderate levels of conditions dur- 
ing Stage 3, client depth of self-exploration 
did not rise accordingly, but rather continued 
to decline (a significant M x T interaction). 
This effect appeared to be independent of 
client level of functioning (see Figures 2 
and 3). 

5. The sequence in which clients were ex- 
posed to Therapists H and M in part deter- 
mined the quality of client performance dur- 
ing their later interview. The performance of 
low-functioning clients seemed to be affected 
more than that of high-functioning clients. 

6. Within each manipulation stage, the
performance of clients interacting with Thera-
pist H gradually improved (as did the per-
formance of Therapist H), while the perform-
ance of clients exposed to Therapist M
gradually deteriorated (as did the perform-
ance of Therapist M). This effect was ap-
parently independent of manipulation stage
(a significant T X C effect). Though sta-
tistically significant, the actual changes
were slight and may represent an artifact in
the data.


Discussion 


It appears that a therapist capable of of- 
fering relatively high levels of therapeutic 
conditions and a therapist capable of offering 
only moderate levels of these conditions had 
differential effects upon the depth of intra- 
personal exploration attained by a group of 
subjects during simulated counseling interviews.
Three points are worthy of note:

1. The differential nature of the therapist’s 
effectiveness appears to have been, in part, 
dependent on both therapist and client level 
of functioning and on the interaction between 
them. That is to say, high-functioning clients 
were most facilitated by Therapist H and 
low-functioning clients were least facilitated 
by Therapist M. 

In general, temporary variation in the per-
formance level of moderate-functioning
therapists may be of greater consequence to
the counseling relationship than similar vari-
ation in the performance level of high-functioning
therapists. With Therapist H, the reinstatement of high levels
of therapeutic conditions led to deeper self-
exploration, irrespective of client level. However,
the downward manipulation of conditions
by Therapist M seemed to initiate a
degeneration of the communication process
which was not overcome even by the reinstatement
of conditions. It may be that this
occurrence is not even a function of condition 
manipulation, but occurs naturally in all 
therapeutic relationships in which the thera- 
pist does not function at a minimally facili- 
tative level. However, there is evidence to 
indicate that under optimal conditions the 
moderately able therapist was able to elicit
at least moderately deep levels of exploration
in his clients. A more tenable hypothesis may
be that under ideal conditions such a therapist
is indeed capable of attaining moderate
levels of success, but his established connec-
tion with his patient may be a basically
superficial one. The high-functioning therapist
seemed to impart enough faith and/or
trust in the therapeutic encounter and in
himself during initial contact and to provide
enough motivation toward self-exploration to
maintain the relationship in the face of
temporary periods of lessened facilitation.
In addition, the high therapist might have
recognized temporary breakdowns in the
communication process more quickly and
easily and, thus, have been better equipped
to explore them openly with his client in an
attempt to uncover their causes.

2. The existence of a significant (p < .05) 
Patient Level X Sequence interaction sug- 
gests that the exposure of a subject to one 
interview had modifying effects on his behav- 
ior during the following interview, and that 
the nature of this effect was in part dependent 
on his assessed level of functioning. The data 
indicate that high-functioning subjects who 
saw Therapist H before Therapist M (P1Q1)
functioned at a slightly lower level than high
subjects who saw Therapist M before Thera-
pist H (P1Q2). This may simply reflect a
small practice or adjustment effect, where
high subjects who were already familiar with
the experimental environment and procedure
were able to respond more freely and fully
to the facilitative nature of Therapist H.
However, low-functioning subjects who were
exposed to Therapist M first (P2Q2) per-
formed at substantially lower levels than did
similar subjects who first saw Therapist H
(P2Q1). The two possible explanations for
this occurrence are not mutually exclusive
and, in fact, are probably equally valid. First,
the P2Q1 group may have experienced a
slight carry-over or “halo” effect, a pleasant 
interaction with Therapist H leading to the 
anticipation of a similar experience with 
Therapist M, and a consequent attribution of 
the former’s characteristics to the latter. 
Second, the initial exposure of low subjects to 
Therapist M may have so disappointed, dis- 
illusioned, or bored them that they became 
somewhat unresponsive to any attempts at 
facilitative communication by Therapist H. 
Looking at P X Q, Therapist Level x Patient 
Exposure to Therapist Level was nonsignifi- 
cant, but its effect may have been masked by 
the presence of Subject X Therapist Level 
within-cell variance. Therapist H elicited 
relatively similar depths of self-exploration 
from both high and low subjects. The self- 
exploration of high subjects seems to have 
been elevated and that of low subjects de- 
pressed by previous exposure to Therapist M. 
There is little or no such interaction present 
at T2: Both high and low subjects explore
themselves more deeply if they have first been
exposed to Therapist H. Note again that
there is no negative carry-over effect in evi- 
dence between Therapist M and high-func- 
tioning clients. The subjects were actually 
able to benefit from the experience later, 
when in the presence of a high-functioning 
therapist. Low-functioning clients, however, 
appear to have been susceptible to the 
deteriorative consequence of an encounter 
with Therapist M. 

Essentially, these data suggest that a “set 
effect” is produced in a client or patient dur- 
ing the course of an initial therapeutic en- 
counter which may be either positive or 
negative, depending on the quality of that 
encounter. These sets are undoubtedly tempo- 
rary and easily overcome by continued ex- 
posure to therapy of a different quality. 
Nevertheless, the existence of such factors 
may have implications for hospitals, counsel- 
ing centers, and other institutions where 
clients (or students) are seen by more than 
one therapist (or teacher). 

3. The present results support earlier find- 
ings (Truax & Carkhuff, 1964) that the level 
of conditions offered by the therapist, at least 
during the initial interview, is determined by 
the therapist and not by the patient. How- 
ever, in that study patient level was em- 
ployed as a static between-subjects variable. 
Present findings do not preclude the pos- 
sibility that longer exposure of patient to 
therapist and/or variations in patient level of 
functioning may have some effect on thera- 
pist behavior. For example, Alexik and Cark- 
huff (1966) suggest that such an effect does 
exist and is in part dependent on therapist 
level of functioning. Possibly, the inability 
of Therapist M to reinstate condition levels 
during Stage 3 commensurate with the im- 
provement demonstrated by Therapist H was 
in part due to the reciprocal effects of de- 
creases in the quality of his clients’ perform- 
ance during Stage 2. Further research into 
the relationships between these variables is 
indicated. 

Summarily, the results of this study indi- 
cate that high-functioning and moderate- 
functioning therapists may have differential 
effects on the depth of client self-exploration. 
In addition, these effects may vary with respect
to such factors as client level of functioning,
stability of client level of functioning,
situational changes in the level of therapist 
functioning, and the client’s previous experi- 
ence in and/or attitude toward therapy. 
Unique effects due to the interaction of two 
or more of these factors must also be 
considered. 

Perhaps the most significant contribution 
of the present study is the indication that 
the quality of psychotherapy can be arti- 
ficially manipulated in a controlled environ- 
ment by therapists of differing skills, with 
measurably differential and predictable re- 
sults. The manipulation of therapeutic condi- 
tions is assumed to simulate situational varia- 
tions which of necessity occur in the facili- 
tative behavior of all therapists, good and 
bad, from time to time during the counseling 
or therapy session. It is the effect of this 
unpredictable or uncontrolled behavior upon 
the client which has been under analysis 
here. Beyond this, the findings expand those 
reported by Holder, Carkhuff, and Berenson 
(1967) by demonstrating differential effects 
and interactions when employing moderate- 
and high-functioning therapists. Within the 
limits of the present study it is suggested 
that future research examine in more detail 
the conditions under which some individuals 
recover from such periods. These studies 
might include manipulation periods over 
longer periods of time and a wider variety of 
therapists and clients. 


REFERENCES 


Alexik, M., & Carkhuff, R. R. The effects of the
experimental manipulation of client self-explora-
tion upon high- and low-functioning therapists.
Journal of Clinical Psychology, 1967, in press.

Carkhuff, R. R. The counselor’s contribution to
facilitative processes. Urbana, Ill.: Parkinson, 1967.

Holder, T., Carkhuff, R. R., & Berenson, B. G.
The differential effects of the manipulation of
therapeutic conditions on high- and low-
functioning clients. Journal of Counseling Psy-
chology, 1967, 14, 63-66.

Truax, C. B., & Carkhuff, R. R. For better or for
worse: The process of psychotherapeutic person-
ality change. In B. T. Wigdor (Ed.), Recent ad-
vances in the study of behavior change. Montreal:
McGill University Press, 1964. Pp. 118-163.

Truax, C. B., & Carkhuff, R. R. Experimental
manipulation of therapeutic conditions. Journal of
Consulting Psychology, 1965, 29, 119-124.


(Received December 26, 1966) 


Journal of Consulting Psychology
1967, Vol. 31, No. 5, 487-491

DIFFERING IN PREDISPOSITION TO
DAYDREAMING *

Ss representing extremes on questionnaires of prior disposition to daydreaming
frequency and thoughtfulness also differed in reports of task-irrelevant imagery 
during rapid-rate auditory signal-detection sessions. While high daydreamers 
showed a significant performance decrement over time in general, they did not 
show significantly more detection errors than did low daydreamers. Results 
suggest that data obtained from questionnaire responses are relevant to per- 
formance in an experimental situation and also support a model relating day- 
dreaming trends to certain patterns of preference for internal or external 
stimulation even under relatively demanding and alerting conditions of 


rapid signal presentation. 


The investigation described here represents 
one outgrowth of a program of research ex- 
ploring the dimensions and functional role 
of man’s spontaneously generated cognitive 
processes—his daydreams, reveries, and gen- 
eral “stream of consciousness” (Singer, 1966). 
Early work in this area emphasized exploration 
of the behavioral correlates and underlying 
theory of Rorschach’s Human Movement 
(M) response as one index of fantasy 
tendencies. Because of the complexity of the 
M response and the psychometric limitations 
of Rorschach data, more direct approaches to 
establishing evidence of daydreaming tendencies 
have been used in some phases of recent 
experiments, specifically, questionnaire and 
interview techniques (Singer & Antrobus, 
1963). It should be noted that Page (1957), 
using a daydream questionnaire similar to 
the one employed in these studies, found that 
M alone, of all Rorschach determinants, was 
significantly related to self-reports of daydreaming 
frequency. More recently Schonbar 
(1965) found that M threshold on the Barron 
inkblots was significantly lower for persons 
reporting more frequent recall of night 
dreams, a measure that had earlier been 
shown to be associated with the frequency 


* A portion of the work done in this study was 
supported under Grant M-10956 from the National 
Institute of Mental Health, United States Public 
Health Service. 


* Now at Temple University. 


of daydreaming as measured by a question- 
naire (Singer & Schonbar, 1961). 

The present experiment attempts to study 
predisposition to daydreaming within an ex- 
perimental framework that, at first, seems far 
removed from ordinary clinical application of 
fantasy material. It is proposed that day- 
dreaming represents a widespread normal hu- 
man phenomenon, with differential develop- 
ment as a function of early experience and 
practice and with differential expression as 
a result of the relative pressures of 
environmental stimulation (Singer, 1966). 
Recent studies of vigilance and signal- 
detection behavior have opened up new theo- 
retical conceptions of man’s capacities as an 
information-processing organism (Broadbent, 
1958; Buckner & McGrath, 1963). The 
tendency to respond to “inner” or “external” 
channels or stimulus complexes is one way 
of looking at the daydreaming patterns of 
the introversive experience type so often re- 
ported in Rorschach and personality research. 
The model presented here as well as the 
study described exemplify the application of 
the recent experimental approaches to study- 
ing the relatively evanescent and fluid 
experiences of man’s ongoing inner processes. 


EXPERIMENTAL MODEL AND HYPOTHESES 


Most investigators studying man’s signal- 
detection capacity, ability to remain alert or 
vigilant during long “watches” in understimulated 
environments, have focused on external 
parameters such as signal modality, fre- 
quency, duration of watch or pattern of 
stimulus presentation. Important individual 
differences in response have been noted, how- 
ever. Broadbent (1958) has called for in- 
creased attention to individual differences 
during signal detection and vigilance behav- 
ior, with specific reference to the degree to 
which self-mediated responses may either 
maintain arousal in understimulated tasks or 
provide response competition in a fairly stimu- 
lating task. Bakan, Belton, and Toth (1963) 
found that introverts (Maudsley Scale) 
showed significantly less decrement in a pro- 
longed auditory vigilance watch apparently 
because there were “reinforcers other than 
signal occurrence” which maintained arousal. 
Experimental work by Antrobus and Singer 
(1964), using the subject’s own free associ- 
ative speech as the independent variable, 
suggests that varied internally generated responses 
may sustain alertness over long periods 
in “understimulated” monitoring tasks. 
These varied inner responses, analogous to 
daydreaming, may also compete with external 
signals and cause relatively inferior perform- 
ance when tasks are of shorter duration, more 
demanding, or when arousal is maintained by 
external means, such as rousing brass-band 
march music. 

The present experiment studied the effect 
of spontaneous covert thought processes, for 
example, daydreaming, on signal-detection 
performances. The frequency of spontaneous 
thought processes carried on during the detec- 
tion task was varied by preselecting subjects 
who reported characteristically high and low 
frequencies of daydreaming and absorption in 
their own thoughts while in their normal 
environment. It was predicted that the high 
daydreamer group would report more task- 
irrelevant thoughts during the performance 
of the task and would make more detection 
errors. The relatively high signal density and 
rapid rate of stimulus presentation employed 
in the present experiment suggest that the 
task itself is sufficiently stimulating to main- 
tain subjects at their normal waking level of 
arousal. Individual differences in internally 
produced stimulation should not, therefore, 
contribute significantly to individual variations 
in arousal for this experiment. 

In effect the experimental model employed 
proposes that normal adults have a limited 
capacity for “operating” on external and 
internal stimulus events. Under most circum- 
stances there is a slightly greater tendency 
to react to the external environment. (Com- 
pare Rapaport’s, 1960, p. 233, formulation of 
Freud’s attention-cathexis theory and his 
conjecture that “attention cathexes have a 
permanent gradient towards the external 
excitations.”) When external stimulation is 
markedly reduced (as in sensory-deprivation 
conditions) or monotonous (as in the long 
“watches” of a vigilance experiment), there 
may be an upsurge of awareness of the on- 
going spontaneous activity of the brain and 
an increase in free associative thought or 
daydreaming. In addition, it is proposed that 
longstanding practice and other individual 
motivational factors may predispose some per- 
sons to assign higher priority to “operating” 
on internal channels under ordinary circum- 
stances. In experimental situations involving 
rapid signal detection, these relatively inner- 
oriented persons should report relatively 
more thought that is unrelated or irrelevant 
to the detection task; in addition, because 
of the limited capacity of their “cognitive 
operators,” they may show more errors in 
detections due to greater attention to “internal 
signals” such as daydreams. 


METHOD 
Subjects 


College students who scored at the extremes of 
the distribution of scores on two scales, the General 
Daydreaming Questionnaire (Singer & Antrobus, 
1963) and the Thoughtfulness subscale of the 
Guilford-Zimmerman Temperament Survey (Guil- 
ford & Zimmerman, 1949) were employed in the 
study. The Daydreaming scale calls for reports of 
frequency of occurrence of a wide variety of day- 
time fantasies, while the Thoughtfulness scale reflects, 
among other things, “interest in thinking versus . . . 
overt activity.” The Thoughtfulness scale represents 
a measure of thinking introversion, interest in ideas 
and concepts, and emphasis on ideation rather than 
action. It is distinct in Guilford’s research from 
social introversion. Combining the extreme Ss from 
both the Daydreaming and Thoughtfulness scales 
ought to maximize the degree to which they differed 
in attention to internal events as against interest 
in external activity. From 157 students tested, 
10 high and 10 low scorers on both scales (5 men 
and 5 women in each group) were used in the signal- 
detection experiment. The high and low Ss thus 
represented extremes in self-report on scales of fre- 
quency of daydreaming and inclination to internal 
responsiveness. 


Procedure 


The Ss were placed in sound-attenuated, light- 
free cubicles to minimize response to external cues 
other than the signals. Randomly ordered 1-second 
pulses of pure tones of two highly discriminable 
frequencies were presented to Ss through earphones 
along with white masking noise. The Ss responded 
to signals by depressing a telegraph key and indi- 
cated the presence of task-irrelevant thought as re- 
quired by responding on a second switch. Signal 
presentation, interruptions, scoring, and recording of 
detection and reports of extraneous thought were 
automated. A task-irrelevant thought was carefully 
defined for Ss as any thought or image occurring 
during the immediately preceding trial which had as 
a referent some perceptual event that did not occur 
during the trial. Thoughts of a recent exchange with 
a member of the opposite sex, errands to be run in 
the future, or what the experimenter was doing 
in the next room were common examples of task- 
irrelevant thought. Thoughts about the accuracy of 
one’s performance were called task relevant if they 
referred only to one’s performance within that trial 
but task irrelevant if they referred to one’s perform- 
ance on a previous trial or over the task in general. 
Earlier studies in which content was specifically 
required in this situation indicated that such task- 
irrelevant thoughts included a great deal of highly 
personal fantasies, wishes, and plans. After each 15- 
second trial, the S signaled “yes” if at least one 
task-irrelevant thought occurred and “no” if no 
task-irrelevant thoughts occurred. A more refined 
scale was not employed because of the difficulty in 
defining a unit of thought. The Ss received 100 
trials, each consisting of 15 tones presented at a 
rate of 1 per second. The Ss were to respond only 
to the lower of the two easily discriminable tones. 
At the conclusion of the last trials, Ss recorded 
examples of the kinds of thoughts which they had 
rated task relevant or task irrelevant. 


RESULTS 


Inspection of Figure 1 indicates that the 
“high daydreamers” showed consistently more 
reports of task-irrelevant thought throughout 
the 100 trials; they reported task-irrelevant 
thoughts for 75.8 trials compared with a com- 
parable response on only 44.1 trials by the 
subjects low in predisposition to daydream- 
ing and thoughtfulness. The difference, sig- 
nificant at the .005 level, one-tailed test, 
supports the validity of the original criteria 
by which subjects were selected. Daydreamer 
introverts do indeed tend to have 
task-irrelevant thoughts while performing a 
signal-detection task. Both groups also 
showed statistically significant increases in 
reports of such spontaneous cognitive activity 
as the task proceeded. Comparison of the 
first 25 with the last 25 trials indicates an 
increase on the average of 4 reports for the 
high daydreamers (p < .01) and of 2.1 for 
the low daydreamers (p < .025), but the 
difference between the two groups in relative 
increase was not statistically significant. 

How does this greater tendency to process 
internally generated cognitive content on the 
part of the high daydreamers affect their 
detection performance? Figure 2 indicates an 
overall sizable difference in the detection accu- 
racy of the groups. Considering that the task 
was a relatively simple one and that the per- 
centage of correct detections for all subjects 
was 95.9%, the high daydreamers produced 
a mean of 83.3 incorrect responses out of 
1,500 signals compared with 40.4 for low 
daydreamers, a difference that is not, how- 
ever, statistically significant. The error rate 
of the high daydreamers was 5.55% of all 
signals, twice that of the low daydreamers 
(2.69%), but the difference was not statisti- 
cally significant. After using a square-root 
transformation to correct for a marked skew- 
ing of the data, an F of 2.08 (df = 1/18, ns) 
was obtained. Although the difference is not 
statistically significant, the observed difference 
is sufficiently large to caution against 
assuming that there is no real effect of 
daydreaming on detection errors. 
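
The percentages in the preceding paragraph follow directly from the mean error counts and the 1,500 signals per subject. The short Python sketch below reproduces that arithmetic. The variable and function names are illustrative only, and the overall figure assumes the two groups are weighted equally (10 Ss each, as stated under Subjects).

```python
# Worked check of the error rates reported in the text.
SIGNALS_PER_SUBJECT = 1500  # 100 trials x 15 tones per trial

# Mean incorrect responses per group, as reported in the text.
mean_errors = {"high": 83.3, "low": 40.4}


def error_rate_percent(errors, signals=SIGNALS_PER_SUBJECT):
    """Return errors as a percentage of all signals presented."""
    return 100.0 * errors / signals


rates = {group: error_rate_percent(e) for group, e in mean_errors.items()}

# Assumption: equal group sizes, so the overall error rate is the
# simple average of the two group rates.
overall_correct = 100.0 - sum(rates.values()) / len(rates)

print({group: round(rate, 2) for group, rate in rates.items()})
print(round(overall_correct, 1))
```

Under these assumptions the computed rates agree with the reported 5.55%, 2.69%, and 95.9%.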

Inspection of Figure 2 indicates that the 
small difference in accuracy between the two 
groups at the outset increases sharply as the 
session progresses. Employing the decrement 
score proposed by Buckner and McGrath 
(1963)—the difference in correct responses 
between the first and last quarter of the 
100 trials—the mean decrement for high 
daydreamers was 13.1 (p < .01) while that 
for low daydreamers was -1.5 (ns). The 
difference in decrement between the two 
groups was significant at p < .025. It would 
appear, therefore, that somewhat different 
patterns of errors do occur for the high and 
low daydreamers, despite the lack of an over- 
all significant difference in detections for the 
two groups. 
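
The decrement score described above is simply the number of correct responses in the first quarter of the session minus the number in the last quarter, so a positive value means performance fell off. A minimal Python sketch follows. The per-trial counts are invented for illustration and are not data from the study, and the function name is hypothetical.

```python
def decrement_score(correct_by_trial):
    """First-quarter correct responses minus last-quarter correct responses.

    A positive score indicates a performance decrement over the session;
    a negative score indicates improvement.
    """
    quarter = len(correct_by_trial) // 4
    first = sum(correct_by_trial[:quarter])
    last = sum(correct_by_trial[-quarter:])
    return first - last


# Hypothetical subject (not study data): 100 trials, 7 correct detections
# per trial for the first 75 trials, then 6 per trial for the last 25.
hypothetical_subject = [7] * 75 + [6] * 25

print(decrement_score(hypothetical_subject))
```

With 100 trials, each quarter covers 25 trials, matching the first-25 versus last-25 comparison used in the text.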


DISCUSSION 


The results of the present study clearly 
indicate that predisposition to daydreaming 
and internal responsiveness carries over into 
performance in a relatively controlled signal- 
detection situation. The persons who indicate 
a considerable tendency in daily life towards 
involvement with their own thought processes 
continue to demonstrate such a predilection 
even when involved in a rapid, attention- 
demanding signal-detection task. Indeed, despite 
the usual lore about “absent-minded 
daydreamers” it is remarkable that high daydreamers 
did not miss significantly more 
signals, on an overall basis, than the low day- 
dreamers. As a matter of fact, it would appear 
that under the stress of an alerting task of 
this type these youthful and apparently well- 
motivated subjects manifest evidence that 
humans have perhaps greater “channel 
capacity” than one would think. The Ss were 
able to process a considerable number of 
“task-irrelevant” thoughts while missing only 
a very few of the 1,500 signals. The indica- 
tion of some greater increase in errors over 
time for the high daydreamers may be the 
result of a relative preference for certain 
types of stimulation. At the outset of the 
signal-detection task, when the situation is 
quite novel for all Ss, both high and low 
daydreamers may show the same preference 
for the task over any other source of stimula- 
tion. Increasing exposure reduces its novelty. 
Indeed, it has been shown in an earlier study 
(Singer & Antrobus, 1963) that the Day- 
dreaming and Thoughtfulness scales are cor- 
related with measures of curiosity. It may 
be that with the reduced novelty of the 
signal-detection task the high daydreamer 
reduces the “payoff” he has assigned to 
processing the external signals. The high 
daydreamer has an alternative source of 
stimulation to which he turns with increasing 
frequency—his own thoughts and imagery. 
For the low daydreamer this option is less 
available, and the drastically limited external 
environment provides no alternative. Hence 
he sticks to the detections with maximum 
attention longer. 

The present results, while admittedly 
drawn from a relatively artificial situation 
(relevant chiefly to certain industrial, mili- 
tary, or aerospace situations), do indeed 
suggest that longstanding predispositions in 
attention to inner or external channels of 
experience are worth more extensive study 
and that self-report techniques can be useful 
in addition to reliance on projective methods. 
More subtle examinations of the relationships 
between fantasy predisposition and detection 
accuracy or responsiveness to one’s own 
products under different conditions of arousal 
or motivation are now underway. 
ACTUARIAL, NAIVE CLINICAL, AND SOPHISTICATED 
CLINICAL PREDICTION OF PATHOLOGY FROM 
FIGURE DRAWINGS 


GEORGE STRICKER 1 
Adelphi University 


87 pairs of figure drawings, 37 by normals and 50 by psychiatric patients, 
were given to 22 clinical students and 6 experienced clinicians, along with a 
complete account of the pertinent research findings of Hiler and Nesvig. Ss 
were asked to judge whether the drawings were normal or pathological. The 
students were superior to the clinicians in the task, and 23% of them were 
superior to an actuarial formula that had been constructed previously. Data 
were interpreted to support the relative superiority of sophisticated over 
naive clinical prediction and the possibility that in some circumstances it may 
be superior to actuarial formulas. 


Hiler and Nesvig (1965), after collecting a 
large number of figure drawings from groups 
of normal and psychiatric adolescents, devel- 
oped a predictor formula which, when applied 
by each of three students, correctly classified 
78%, 78%, and 82% of the sample. Experi- 
enced clinicians and nonpsychologists, mak- 
ing judgments about a similar set of draw- 
ings, were only able to classify 64% and 65% 
of the drawings correctly, and no single indi- 
vidual was able to approach within five per- 
centage points of the accuracy of the least 
effective student using the formula. The au- 
thors attributed the clear superiority of the 
formula to its success in eliminating a number 
of inappropriate cues in the judgmental proc- 
ess, with subsequent reliance on only the most 
valid indicators. Since three of the four indi- 
cators (definitely bizarre; major part omitted; 
happy, pleasant facial expression; nothing 
pathological) used in the formula call for some 
measure of clinical judgment,2 the authors did 
not recommend the elimination of clinical 
judgment, but rather, the channeling of such 
judgment into directions indicated by empiri- 
cal findings. 


1 The author would like to thank Arnold Projansky 
for his help in data collection, and E. Wesley Hiler, 
who very kindly lent the figure drawings which he 
had used in a previous study and upon which this 
research is based. 

2 It is the retention of some reliance on clinical 
judgment in the formula which produced the differ- 
ences in the amount of success of the students with 
the formula. If the formula was entirely objective, 
the students should all have achieved the same level 
of success. 


This recommendation is reminiscent of a 
distinction drawn by Holt (1958) between 
naive and sophisticated clinical prediction. 
Naive clinical prediction relies upon intuitive 
and subjective means of selecting and inte- 
grating cues in arriving at a judgment. So- 
phisticated clinical prediction uses available 
empirical data along with subjectivity, so that 
the final decision, arrived at through a clinical 
rather than actuarial combinatorial procedure, 
has been influenced by the same type of 
objective empirical information available in 
the construction of an actuarial formula. 

The study by Hiler and Nesvig (1965) 
clearly pitted actuarial prediction against 
naive clinical prediction, with the results in 
favor of the formula; this is consistent with 
many previous studies of this design (Meehl, 
1954). The purpose of the present study is to 
make the Hiler and Nesvig findings available 
to a group of clinicians and then ask them to 
use whatever procedure they choose in making 
judgments about the pathology of the sub- 
jects who drew the figures. This provides a 
clearer approximation of sophisticated 
clinical judgment and may cast some light 
upon the relative efficacy of the three ap- 
proaches. 


PROCEDURE 
Subjects 


Three groups of Ss representing different levels of 
experience with psychodiagnostic techniques were 
employed. The most experienced group consisted of 
PhD clinicians (N = 6), whose range of postdoctoral 
experience was 5-22 years, with a mean and 
median of 14 years. Each of them either held a 
full-time clinical position, was engaged in private 
practice which included psychodiagnostic services, or 
both. The middle group was the entire third-year 
class (N = 10) of students in the Adelphi clinical 
training program. These students were a few weeks 
away from completing all their academic require- 
ments, but had not begun a clinical internship. They 
had completed all of their instruction in psychodiag- 
nostics and had had practicum experience with fig- 
ure drawings of both normal and psychiatric popu- 
lations. The least experienced group was the entire 
first-year class (N = 12) of students in the Adelphi 
clinical training program. These students had some 
formal academic training in the use of figure draw- 
ings and some practicum experience using figure 
drawings in the public school system, but they had 
not begun the practicum course in psychodiagnostics 
with psychiatric patients. Each of the students had 
taken at least a portion of his course work in psy- 
chodiagnostics with one of the PhD clinicians, and 
most of the third-year students had been supervised 
in the practicum by at least one, and sometimes two, 
of the clinician judges. 


Stimulus Materials 


Eighty-seven pairs of figure drawings, 37 by nor- 
mals and 50 by psychiatric patients, were used. These 
were the same drawings which were classified by 
the predictor formula in the Hiler and Nesvig (1965) 
study. Additional information about the people who 
performed the drawings is contained in the instruc- 
tions below; a complete description of the drawings 
and an account of the method of collecting them 
can be found in Hiler and Nesvig (1965). 


Method 


Each S was seen individually and given the fol- 
lowing written instructions: 3 


All of the drawings you will be shown were 
drawn by adolescents between the ages of 13 and 
16, in response to instructions to “Draw a whole 
person. Do the best job you can,” and after completion 
of the first figure, “Draw a person of the 
opposite sex.” Some of them were patients at a 
State Hospital (about half were diagnosed “psy- 
choneurotic” and the rest were spread out among 
“organic,” “schizophrenic,” and “character dis- 
order”), and some were normal children, function- 
ing successfully in their community. Your task 
will be to look at each set of drawings, and then 
make a judgment as to whether it was the prod- 
uct of a “patient” or a “normal.” 


There has been considerable research done with 
the drawings of children from precisely this popu- 


3 These instructions summarized the substantive 
findings of the Hiler and Nesvig study and included 
information revealed by the study but not included 
in the predictor formula. This leads to a discrepancy 
between the number of indicators used in the form- 
ula and the number of findings presented to the 
judges. 
lation, and a number of findings have emerged 
which may help you with your judgment. These 
findings have been developed relevant to the task 
at hand, and while they may or may not have 
wider generality they have been found significant 
in making this decision for these groups. 

The major characteristics which were present in 
the drawings of the patients were: 


1. Bizarreness—This category includes such im- 
pressions as “schizy,” “grotesque,” “inhuman,” 
“sinister,” “sick,” “ghoulish,” “weird,” and “gnome- 
like,” but not simply “peculiar” or “distorted.” 

2. Omission of major parts of the body—The 
omission of major parts of the body, such as head, 
body, arms, legs, hands, feet, eyes, nose, mouth 
and hair, and particularly of arms, hands and 
torso, was more characteristic of “patients” than 
“normals.” 

3. Distortions—This category was particularly 
effective if distortion of the head or arms was 
present. 

4. Transparencies—This category referred par- 
ticularly to transparency of the body or legs 
through the clothing. 

The major characteristics which were present in 
the drawings of the normals were: 


1. Happy, pleasant facial expression 


2. Nothing pathological—The subjective impres- 
sion that there was nothing pathological in a 
drawing was much more common in the drawings 
of “normals.” 

A number of cues which may be of some clini- 
cal use in other tasks, or for making other judg- 
ments, proved to be of no value in discriminating 
between these two groups. These cues include: 

1. Conflict and anxiety indicators—These include 
line emphasis, erasures, sketchiness, excessive shad- 
ing and a hostile or anxious appearance. These 
were not present in either group with any greater 
frequency. 

2. Size and pressure—Neither size nor lightness 
of pressure discriminated significantly. 

3. Absence of clothing 

4. Proportion between body parts 


5. Motion and posture 


The judge was allowed to retain the instructions, to 
refer to them as often as he chose, and to make a 
judgment of “normal” or “pathological” for each set 
of drawings on whatever basis he wished. At the 
time of administration the examiner was not aware 
of the correct classification of any of the drawings, 
reducing the possibility of experimenter bias affecting 
Ss’ judgments. 


RESULTS AND DISCUSSION 


In classifying drawings as either normal or 
pathological, the clinicians were correct in 
66% of their judgments, the first-year students 
in 72% of their judgments, and third- 
year students in 73% of their judgments. This 
can be compared to a base rate of 57% if all 
drawings were classified as pathological. 
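
The 57% base rate follows from the composition of the stimulus set: a judge who called every pair of drawings pathological would be correct on exactly the 50 patient pairs out of 87. A small Python check, using only the counts given in the text (the variable names are illustrative):

```python
# Composition of the stimulus set, as reported in the text.
normal_pairs = 37
patient_pairs = 50
total_pairs = normal_pairs + patient_pairs

# Accuracy obtained by labelling every pair "pathological".
base_rate_percent = 100.0 * patient_pairs / total_pairs

print(total_pairs)
print(round(base_rate_percent))
```

The group accuracies of 66%, 72%, and 73% can then be read as margins of roughly 9, 15, and 16 percentage points over this baseline.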

A simple randomized analysis of variance 
comparing the three groups did not quite 
reach the usual standard for statistical significance 
(F = 3.23, df = 2/25, .05 < p < .10). 
When the two student groups were combined 
and compared with the clinicians, significance 
was reached (F = 6.10, df = 1/26, 
p < .05). It seems reasonable to conclude that 
the students were more effective in their abil- 
ity to discriminate normal from pathological 
figure drawings than were experienced clini- 
cians. 

In the original Hiler and Nesvig study 
(1965), the psychologists were correct in 
64% and the nonpsychologists in 65% of 
their judgments. These figures are quite close 
to the 66% standard achieved by the clini- 
cians in the present study. The predictor 
formula in the Hiler and Nesvig study was 
applied by three students who were correct 
78%, 78%, and 82% of the time. This aver- 
age is higher than that achieved by students 
in the present study. However, 23% of the 
present students were able to do better than 
the modal 78% accuracy of the formula, and 
55% were able to do better than the 73% 
accuracy achieved by the best psychologist 
(who used naive clinical judgment) in the 
Hiler and Nesvig study. 
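
The text reports the 23% and 55% figures without giving head counts. Assuming, as seems likely but is not stated, that both percentages are taken over the 22 students combined, the sketch below finds which whole-number counts round to each reported percentage. The function name is hypothetical.

```python
N_STUDENTS = 22  # 10 third-year plus 12 first-year students


def counts_matching(percent, n=N_STUDENTS):
    """Return every head count k (0..n) whose share of n rounds to `percent`."""
    return [k for k in range(n + 1) if round(100 * k / n) == percent]


print(counts_matching(23))
print(counts_matching(55))
```

Under that assumption the figures correspond to 5 and 12 of the 22 students, which fits the later description of "almost one-quarter" exceeding the formula.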

The superiority of the students to their 
teachers and to the psychologists in the previ- 
ous study seems to provide testimony for 
the superiority of sophisticated over naive 
clinical judgment in Holt’s sense of the terms. 
The psychologists in the previous study did 
not have any empirical data available to them, 
and the clinicians in this study seemed quite 
reluctant to abandon a diagnostic approach 
which they had used for many years, so that 
the data seemed to have minimal impact on 
their judgments. The students, who were not 
fully committed to any specific diagnostic ap- 
proach, were quite willing to use the data 
and to integrate research findings into their 
decision making. The approach of the stu- 
dents was clearly more effective in this situa- 
tion. 

A question may be raised as to whether the 
students, with all the research available to 
them, might be doing little more than apply- 
ing an approximation of the prediction form- 
ula and doing so inefficiently. The fact that 
almost one-quarter of them were able to ex- 
ceed the modal accuracy of the formula seems 
to indicate that at least some of them were 
contributing something other than error vari- 
ance to the decision-making process. 

The relative efficacy of actuarial and clini- 
cal judgment cannot be decided by one iso- 
lated study. These results are limited to one 
diagnostic instrument, one rather simple dis- 
crimination, and one set of patients. How- 
ever, they do suggest the relative superiority 
of sophisticated to naive clinical judgment 
and raise the possibility that clinical judg- 
ment, as practiced by some clinicians, can be 
more accurate than an actuarial formula. 
MEASUREMENT OF SOCIAL COMPETENCE IN 
COLLEGE MALES 1 


RICHARD I. LANYON 
Rutgers—The State University 


This article describes the construction and validation of an instrument to 
assess social competence in college males. Items were written asking for reports 
of verifiable behavior or biographical information which reflected social 
participation, interpersonal competence, achievement, and environmental 
mastery. After preliminary clarification and adjustment of the items, those 20 
items which withstood an internal consistency analysis were labeled the B-III 
scale. The scale was validated by demonstrating that high and low B-III 
scorers differed in the anticipated manner on the MMPI. In addition, socially 
competent fraternity members scored higher than socially incompetent members, 


who in turn scored higher than a classroom normative sample. 


Recent views of mental health (e.g., Jahoda, 
1955; Lazarus, 1961, pp. 21-23) have tended 
to discard the traditional “avoidance of 
stress” approach in favor of more positive 
concepts such as social adequacy, interper- 
sonal competence, achievement, and mastery 
of one’s environment. One advantage of the 
newer approach is that the concepts used are 
fairly operational and thus amenable to meas- 
urement. Several instruments to define and 
measure these concepts have been constructed 
—for example, Doll’s (1953) Vineland Social 
Maturity Scale, for the assessment of re- 
tardation, and the Worcester Scale of Social 
Attainment (Phillips & Cowitz, 1953), for 
assessing social competence in psychiatric pa- 
tients. The competence or achievement 
approach to mental health has the further ad- 
vantage of emphasizing the use of biographi- 
cal and behavioral data in psychological de- 
scription and prediction. There are indications 
(e.g., Fulkerson & Barry, 1961; Little & 
Shneidman, 1959) that such an approach to 
measurement is more valid in many cases than 
the use of traditional psychological testing 
procedures. 

The present paper describes the construc- 
tion of a scale to measure social competence 
in college males, using only biographical and 
behavioral data which involve verifiable state- 
ments of fact. Subjects were considered so- 


1 This study was supported in part by a grant 
from the Rutgers Research Council. Thanks are 
extended to Carol Hamilton and Richard Knoblauch 
for their assistance. 


cially competent to the extent that their 
backgrounds and/or present lives showed 
behaviors or characteristics which indicated 
social participation, interpersonal competence, 
achievement, and environmental mastery. The 
following list of characteristics was drawn up 
as a working definition of social competence: 


1. History of frequent and positive social interaction 
with both sexes. 

2. Participation in organizing and directing group 
activities. 

3. Better than average academic interest and 
achievement. 

4. Acceptance of authority, ability to discipline oneself, 
and no history of legal difficulties. 

5. An unbroken and secure family background, but 
with definite indications that personal freedom 
and responsibility have been encouraged. 

6. Participation in athletic activities. 

7. Some participation in socially desirable adult 
behaviors such as church attendance, drinking, 
and interest in world affairs. 


The list describes a kind of cultural-ideal 
stereotype for a college student. It suggests a 
socially sophisticated, responsible, outgoing, 
friendly, and somewhat aggressive young man 
who was reared by loving yet wise parents 
and who hopes to become a respected and influential 
member of society. 


METHOD 


A set of 46 items was written to represent the 
above characteristics. An effort was made to have 
each topic represented to the extent of its intuitively 
judged relevance to social competence. Some of the 
items were written in multiple-choice form, while 
the remainder required a numerical response. This 
46-item preliminary form of the Biographical Survey 
(B-I) was administered to 45 males in an undergraduate 
psychology class. As a result of their 
answers and comments, ambiguities in the questions 
were removed and multiple-choice foils were ad- 
justed appropriately. Married students and those 
over 20 years old were excluded from this sample 
and from the two following samples. 

The revised form (B-II) was administered to 
135 introductory psychology students, and their re- 
sponses were analyzed in the following manner. 
Frequency distributions were made of the responses 
to each item. Two judges (the author and a senior 
graduate student in clinical psychology with 2 years 
of college counseling experience) independently des- 
ignated the direction and manner of scoring each 
item for social competence. Agreement was reached 
on 36 items, and the remainder were discarded. The 
cutoff point in the response distribution of each 
item was placed at what was considered the border- 
line of social competence. Such a procedure re- 
sulted in 70-80% of the responses to each item 
being defined as competent. One point was assigned 
for each answer in the direction of social compe- 
tence. Thus, the maximum possible score was 36. 

An internal consistency analysis was next carried
out. Point-biserial correlation coefficients were de-
termined between each item and the total score. The
20 items which were found to correlate with the 
total score beyond the 5% level in the predicted 
direction were retained for the final scale. 
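
The scoring and item-analysis steps just described can be sketched in a few lines of Python. This is an illustration only: the response matrix is invented, the function names are hypothetical, and the significance test applied to each coefficient is left out.

```python
from math import sqrt
from statistics import mean, pstdev

def total_scores(keyed):
    """Sum the 0/1 keyed responses of each subject (one point per keyed answer)."""
    return [sum(row) for row in keyed]

def point_biserial(item, total):
    """Point-biserial r between a 0/1 item and the total score."""
    passed = [t for i, t in zip(item, total) if i == 1]
    failed = [t for i, t in zip(item, total) if i == 0]
    p = len(passed) / len(item)
    return (mean(passed) - mean(failed)) / pstdev(total) * sqrt(p * (1 - p))

# Hypothetical data: 6 subjects x 3 items, 1 = answer in the keyed direction.
keyed = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
totals = total_scores(keyed)
item_total_r = [
    point_biserial([row[j] for row in keyed], totals) for j in range(3)
]
```

Items whose coefficient is positive and significant would be the ones retained; the 5% criterion itself depends on the sample size and is not reproduced here.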

The 20-item final scale (B-III) was administered
to a new sample of 195 introductory psychology
students. The mean score was 14.9, and the standard
deviation was 2.6. The 2nd, 10th, 50th, and 90th 
percentiles were 8.4, 11.5, 15.2, and 18.0, respectively. 

The items and key are given in Table 1. The con- 
tent is a fairly even representation of the initial list 
of characteristics (and thus of the initial 46 items), 
except that the items about dating are perhaps more 
heavily represented in the final form. Thus, if it can 
be assumed that the Ss were honest in their re- 
sponses, the high scorers corresponded essentially to 
the description given above—active and energetic, 
decisive, socially extroverted, well emancipated from 
parental control, and with a sense of basic social 
conformity. 


Validation 


The generality of the assumed differences between 
high and low scorers on the B-III was examined 
using group MMPI scores. The Ss in the normative 
group who scored more than one standard deviation 
above the mean were contrasted on the 13 usual 
MMPI scales with Ss scoring more than one stand- 
ard deviation below the mean. It was predicted that 
compared with low B-III scorers, the high scorers’ 
MMPIs should indicate greater interpersonal compe- 
tence and social involvement (lower Si score), less 
anxiety and indecision (lower Pt score), and more 
conforming thought patterns (lower Sc and F scores). 

These predictions were generally supported. Of
the 13 comparisons, 4 were significant beyond the
.05 level. There was a mean difference of 13 T scores
on the Si scale (p < .001), 10 on the D scale (p <
.001), 8 on the Pt scale (p < .01), and 5 on the F
scale (p < .02).² The difference of 5 T scores on the
Sc scale failed to reach significance at the .10 level.
In each case, higher B-III scores were associated
with lower MMPI scores.

The greatest differences in the opposite direction 
were four T scores on the Ma scale, two on the Hy 
scale, and two on the Pd scale, none of which reached 
significance at the .05 level. 

Further validation was carried out with the as-
sistance of a local fraternity (N = 43). These stu-
dents were asked to fill out the B-III and then to
list the names of the five members with whom they 
would most prefer to double-date and the five with 
whom they would least prefer to do so. In this 
manner it was hoped to identify the most and least 
socially competent members of the fraternity. Strict- 
est confidence was assured, and all members returned 
their responses in a sealed envelope. The Ss who re- 
ceived five or more “most prefer” votes were con- 
sidered to be most competent (N = 17), while those 
who received five or more “least prefer” votes were 
considered least competent (N = 10). The mean B-III 
scores of these two groups were 16.94 and 15.80, 
respectively. They differed from each other beyond 
the .05 level and were both greater (each p < .05)
than the mean of the normative sample (14.91). 
Thus the validity of the B-III was again supported. 


Discussion 


The present scale was constructed from the 
responses of male college freshmen and sopho- 
mores, a sample highly restricted in age, edu- 
cation, intelligence, and, to some extent, socio- 
economic level. The restricted nature of the 
sample permitted the use of some items which 
would be inappropriate for other populations 
and reduced the variance due to the factors 
listed above. On the other hand, the scale is 
clearly unsuitable for use with anybody but 
college males. 

What do the B-III items and the validity 
evidence suggest about social competence 
among college males, as measured in the pres- 
ent manner? Emphasis appears to be placed 
on extroversion, activity, and decision mak- 
ing at the expense of thinking and reflection. 
It is noteworthy that the fraternity members 
who were rated least competent nevertheless 
scored higher than the classroom sample. It 
can be concluded that the B-III emphasizes 
the somewhat superficial or salesmanlike as- 
pects of competence at the expense of the 
introspective aspects. 


2 Two-tailed significance tests were used.

TABLE 1
BIOGRAPHICAL SURVEY III

 1. How old was your mother when your parents married? 18-26
 2. How many different girls did you date up to the end of your senior year in high school? 2 and more
 3. What is the total number of dates you had during your senior year in high school? 10 and more
 4. How frequently do you date at present? more often than once a month
 5. How old were you when you began to date regularly? up to and including 17
 6. How many serious physical illnesses have you had during your lifetime (those that have incapacitated you for
    2 weeks or more)? 0 or 1
 7. If you felt you were getting some illness while here at school, what would you do?
      ___ Try to ignore it as long as possible
      ___ Consider going home to your parents’ place
      ___ Go home to your parents’ place
      _x_ Go directly to Student Health or another doctor in town
      ___ Other (Specify)
 8. Have you ever made a trip as much as 200 miles away from home (without your parents or other guardian)
    where you stayed overnight, other than visiting relatives? yes (yes or no)
 9. Have you ever made such a trip as much as 1,000 miles away from home without a parent or other guardian?
    yes (yes or no)
10. How do you approach your school assignments?
      _x_ Get them done ahead of schedule
      _x_ Do them in the last few days, but always get them in on time
      ___ Rush them at the last minute, and sometimes get them in late
      ___ Have habitual problems with getting them done on time, in spite of adequate ability
11. How many times, in your lifetime, have you been spoken to by a policeman for any possible traffic offense,
    except parking? 0-3
12. Do you cook your own meals?
      ___ never
      _x_ rarely; perhaps an occasional piece of toast
      _x_ sometimes; it is not unusual for me to prepare my own meal
      _x_ frequently; I do this as often as not, when I have the opportunity
13. Who usually buys (i.e., selects) your clothes?
      _x_ I do
      ___ my mother does (or similar person)
      ___ sometimes I do; sometimes my mother does
14. With how many social, recreational, or organizational activities were you affiliated during your last year in
    high school? 2 and more
15. Of the activities in No. 14 above, in how many of these (if any) did you hold an office? 1 and more
16. With how many social, recreational, or organizational activities are you affiliated now? 1 and more
17. Do you drink at all now? yes (yes or no)
18. Do you participate frequently and regularly (once a week or oftener) in some nonorganized athletic activity
    (e.g., play handball with Joe on Thursdays)? yes (yes or no)
19. How often do you go to church? any response other than “never”
20. How much freedom do you have (when home) with an automobile?
      _x_ I have my own car
      _x_ I can always (or nearly always) get the car from my parents
      _x_ I can sometimes get the car
      ___ I can occasionally get the car
      ___ I don’t drive

Note. Criterion answers are checked or inserted. No more than 1 point is scored for any one question.

This paper does not answer the question of
whether the kind of social competence meas- 
ured here is healthy in the traditional sense 
of mental health. It has simply outlined a 
consistent set of behaviors and personality 
characteristics which were judged to bring 
social approval to their possessors. Whether 
such people will be regarded as possessing a 
high degree of traditional mental health in 
the long run is a question for further research.


OLD WINE IN NEW SKINS:
GROUPING WECHSLER SUBTESTS INTO NEW SCALES¹


AUKE TELLEGEN AND PETER F. BRIGGS


University of Minnesota 


This paper is a reaction to the extensive literature on new Wechsler subtest 
combinations (e.g., short forms and factor scales). Any new Wechsler composite 
raises elementary issues of reliability, validity, and score standardization. It is 
argued that in the past, these issues have not been dealt with in a satisfactory 
manner. Reliability and validity formulas, appropriate for Wechsler composites, 
are presented. A modified part-whole correlation formula is offered for use in 
cases of nonindependent test administration of part and whole. In order to 
facilitate appropriate clinical use, conversion tables are presented for the 
transformation of composite scores, including factor scores, into Wechsler
Deviation Quotients.


The wide use of the Wechsler intelligence 
scales has led psychologists to explore the 
value of subtest combinations other than the 
original Verbal, Performance, and Full Scale. 
Factor-analytic studies, for example, shed new 
light on the organization of the subtests, in- 
dicating the existence of approximately three 
correlated but distinct “primary” aptitudes, in 
addition to a higher order “general intelli- 
gence” factor (e.g., Guertin, Rabin, Frank, & 
Ladd, 1962). Accordingly, the suggestion has 
been made to replace the a priori Verbal and 
Performance IQs by factor scores obtained 
from appropriate subtest combinations (e.g., 
Cohen, 1959). 

A different line of studies has been con- 
cerned with so-called “short forms”: abbrevi- 
ated scales intended to yield close equivalents 
of the Full Scale IQ. Although originating in 
a different context, one may view short forms 
factor analytically as abbreviated “General 
Factor” measures (e.g., Nickols, 1962). 

New subtest combinations, of course, have 
also been proposed outside the factor-analytic 
and short-form literature (cf. Guertin et al., 
1962). No matter how a particular new sub- 
test combination arises, its suggested use raises 
certain elementary issues. Just as in the case 
of an entirely new test, the following ques- 
tions must certainly be considered: What is 
the reliability of the proposed subtest combi- 
nation? What is its correlation with meaning-
ful criteria? How are test results to be con-
verted into standardized values? Inspection
of relevant literature led the authors to con-
clude that quite generally, the above questions
have either been neglected or have been dealt
with in inappropriate and unproductive ways.

1 Acknowledgment is made of support in part by
the Vocational Rehabilitation Administration and
Training Grant Number 2.

Factor-analytic studies, for instance, have 
primarily been concerned with isolating func- 
tional unities rather than with such problems 
of application as transforming factor scores 
into uniform and familiar Wechsler-type quo- 
tients. The short-form literature, while em- 
phasizing application, shows methodological 
shortcomings: It neglects reliability, ap- 
proaches evaluation of validity by means of a 
misleading index (the part-whole correlation), 
and uses unsound methods for the conversion 
of scores into IQs. 

One purpose of this paper is to document 
the comments just made and to suggest better 
ways of evaluating new subtest combinations 
through the use of appropriate formulas. A 
further objective is to encourage and facili- 
tate clinical use of new combinations, includ- 
ing those which have been suggested by fac- 
tor-analytic research. Tables will be presented 
for the rapid conversion of scores obtained on 
such new combinations into Wechsler Devia- 
tion Quotients. 

In using the presented formulas and tables 
it is important to keep in mind that they apply 
only when the subtests included in a subtest 
combination have equal weights and variances. 
The assignment of equal weights implies that 
the total scores on a given subtest combina- 
tion or “composite” are obtained by simple 
summation of the scores on the component
subtests. The assumption of equal variances
requires that prior to summation the subtest 
raw scores be transformed into scaled scores 
as determined separately for the appropriate 
age group. In the case of the WISC, this is 
standard procedure (Wechsler, 1949). In the 
case of the WAIS, one must be sure to use the 
optional tables in the appendix of the manual 
(Wechsler, 1955, pp. 101-110). The manuals 
of the Wechsler-Bellevue I and II do not con- 
tain the needed information. 

The three issues—reliability, validity, and 
score standardization—will be discussed in 
order. 


RELIABILITY 

Full and proper use of a new subtest com- 
bination requires information concerning its 
reliability, yet such information is rarely, if 
ever, provided in the case of new Wechsler 
subtest groupings. The reliability of a com- 
posite is actually readily obtained from the 
reliabilities and intercorrelations of the com- 
ponent tests. The general formula has been 
derived by Mosier (Guilford, 1954, p. 393). 
When the subtests have equal weights and 
equal variances, the formula may be simpli- 
fied as follows: 


$$r_{cc} = \frac{\sum r_{jj} + 2\sum r_{jk}}{n + 2\sum r_{jk}} \tag{1}$$


where $r_{cc}$ = reliability coefficient of the sub-
test combination or composite C, $r_{jj}$ = relia-
bility coefficient of any component subtest j,
$r_{jk}$ = correlation between any component sub-
tests j and k (where subscript k is numerically
larger than subscript j), and n = number of
component subtests.

An illustration may be useful. Suppose we 
want to determine the reliability of the well 
known “Doppelt” short form of the WAIS 
(Doppelt, 1956). This abbreviation consists 
of the Arithmetic, Vocabulary, Block Design, 
and Picture Arrangement subtests. Limiting 
ourselves to the 25-34-year-old group, we find 
(from p. 13 of the manual) that $\sum r_{jj}$, the sum
of the reliabilities of the four component sub-
tests, equals 3.19. Turning to page 16 we find
that $\sum r_{jk}$, the sum of the six intercorrelations
between the components, equals 3.32, so that
$2\sum r_{jk}$ = 6.64. With n = 4, we arrive at $r_{cc}$ =
.924.
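
As a check on the arithmetic, Formula 1 can be evaluated directly. The Python sketch below is only an illustration; the function and argument names are hypothetical, and the inputs are the sums quoted above for the 25-34-year-old group.

```python
def composite_reliability(sum_rjj, sum_rjk, n):
    """Formula 1: reliability of an equally weighted composite.

    sum_rjj -- sum of the component subtest reliabilities
    sum_rjk -- sum of the intercorrelations among the components (j < k)
    n       -- number of component subtests
    """
    return (sum_rjj + 2 * sum_rjk) / (n + 2 * sum_rjk)

# Doppelt short form, 25-34-year-old group (values quoted in the text).
r_cc = composite_reliability(sum_rjj=3.19, sum_rjk=3.32, n=4)
print(round(r_cc, 3))
```

The same function applies to any equally weighted set of age-scaled subtests, given the corresponding sums from the manual.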

The danger of ignoring reliability is illus-
trated by a different type of Wechsler short
form, one obtained by reducing the length of 
the subtests instead of their number (Satz & 
Mogel, 1962; Yudin, 1966). The advantage 
of this type of abbreviation seemed to lie in 
preserving the possibility of evaluating the 
complete subtest profile. The problem of re- 
liability, however, was not considered; Zytow- 
ski and Hudson (1965) seem to have been 
alone in pointing out its relevance in this 
context. 

Consider Yudin’s (1966) recently proposed 
WISC abbreviation. Consulting the informa- 
tion on the 10.5-year-old WISC normative 
group (Wechsler, 1949, p. 13), we find that 
reliability coefficients for the full-length sub- 
tests range between .59 and .91, with five co-
efficients of .80 or more. The reliabilities of
Yudin’s subtest abbreviations, on the other hand,
when estimated from the same data by means of
the Spearman-Brown formula, are found to range
between .39 and .77, with five coefficients below
.60. Remembering that differences between
test scores tend to be much less reliable than 
the tests themselves (e.g., McNemar, 1957), 
one has to conclude that abbreviated subtests 
will yield highly unreliable profile data (even 
if total scores were sufficiently reliable). An
additional difficulty of this type of abbrevia- 
tion may be mentioned, namely, its use of the 
so-called prorating procedure, which will be 
discussed later. 
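
The two steps of this argument can be illustrated numerically. The sketch below is hedged in several ways: the Spearman-Brown function is the standard one, but the length ratio of one half is an assumed example (the actual ratios of Yudin's abbreviations are not given here), and the difference-score formula is the usual textbook expression for two equally variable scores, which we assume is the kind of result the McNemar reference has in mind. All input values are hypothetical.

```python
def spearman_brown(r_full, k):
    """Estimated reliability when test length is multiplied by k (k < 1 shortens)."""
    return k * r_full / (1 + (k - 1) * r_full)

def difference_reliability(r_xx, r_yy, r_xy):
    """Reliability of the difference X - Y for scores with equal variances."""
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)

# Hypothetical: a full-length subtest with reliability .80, cut to half length.
r_short = spearman_brown(0.80, 0.5)

# Hypothetical: two abbreviated subtests, each with reliability .60,
# correlating .40 with each other.
r_diff = difference_reliability(0.60, 0.60, 0.40)

print(round(r_short, 2), round(r_diff, 2))
```

Under these assumed values the difference between two subtest scores is considerably less reliable than either subtest, which is the point made about profile data.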

The present authors share Pauker’s (1963) 
and Zytowski and Hudson’s (1965) reserva- 
tions regarding this type of short form. On 
the basis of available evidence and the above 
considerations, however, we would like to be 
even more emphatic in declaring this format 
ill-suited for precisely that function for 
which it was designed—profile analysis. If 
abbreviation is necessary, then the cause of 
meaningful profile study is better served by 
the use of a reduced number of sensibly 
chosen full-length subtests. Even more satis- 
factory, from the viewpoint of the reliability 
and meaning of the profile, would be the use 
of a set of factor-analytic composites, dis- 
cussed in a later section. 


VALIDITY 
Questions will arise concerning the correla- 
tion between a new subtest combination and 
relevant criterion variables. This correlation
may be computed from the criterion’s corre-
lations with the component subtests and the 
intercorrelations of the latter. The following 
formula, a simplification of the equation for 
the correlation of sums, is applicable if we 
assign equal weights and variances to the 
component subtests: 

$$r_{cy} = \frac{\sum r_{jy}}{\sqrt{n + 2\sum r_{jk}}} \tag{2}$$

where $r_{cy}$ = correlation between composite C
and criterion Y, and $r_{jy}$ = correlation between
any component subtest j and criterion Y, with
n and $r_{jk}$ defined as in Formula 1.

As an example, we may determine the rela- 
tionship between the Doppelt short form and 
the WAIS “General Intelligence” factor as de- 
termined by Cohen (1957b). For the four 
component subtests Cohen reports on the 
25-34-year-old group the following general 
factor loadings (i.e., correlations with the 
general factor): .71, .79, .71, and .69. Sub-
stituting in Formula 2 the sum of the above
values (2.90), and with n equal to 4 and
$2\sum r_{jk}$ equal to 6.64, we obtain $r_{cy}$ = .89.
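
The computation in this example can be reproduced with a short Python sketch. The function name is hypothetical; the loadings and the sum of intercorrelations are those quoted in the text.

```python
from math import sqrt

def composite_validity(r_jy, sum_rjk):
    """Formula 2: correlation of an equally weighted composite with criterion Y.

    r_jy    -- list of correlations between each component subtest and Y
    sum_rjk -- sum of the intercorrelations among the components (j < k)
    """
    n = len(r_jy)
    return sum(r_jy) / sqrt(n + 2 * sum_rjk)

# General-factor loadings of the four Doppelt subtests (Cohen, 1957b).
loadings = [0.71, 0.79, 0.71, 0.69]
r_cy = composite_validity(loadings, sum_rjk=3.32)
print(round(r_cy, 2))
```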

It seems that authors and evaluators of 
short forms have been particularly concerned 
with the validity of new Wechsler composites 
as expressed by the so-called part-whole cor- 
relation between short form and Full Scale. 
The literature concerned with that index is, 
indeed, extensive. 

The reason for interest in Wechsler part- 
whole correlations is understandable enough: 
They are to inform us of the degree of equiv- 
alence between short forms and Full Scale. 
The difficulty with the usually reported part- 
whole correlation is that we cannot interpret 
it in the way we have learned to interpret 
other correlations. Ordinarily a nonzero cor- 
relation between two psychological variables is 
attributed solely to a relationship between the 
“true scores” on the two measures and, by the 
same token, is viewed as wholly interpretable 
in terms of the “meaning” of the two tests. 
This view is based on the traditional assump- 
tion that the “measurement errors” occurring 
on the two tests are independent and, there-
fore, do not contribute to the observed corre-
lation but only attenuate it (the well
known “correction for attenuation” is of course
meant to eliminate the effects of these inde-
pendent errors).

In the case of the part-whole correlation,
the assumption of error independence is vio- 
lated whenever the part score is obtained 
from the same test administration as the 
score on the full-length scale. In that case the 
part-whole correlation between a Wechsler 
short form and the Full Scale, for example, is 
boosted by the obviously perfect correlation 
between the error of the short form and that 
same error as a constituent of the Full Scale 
score. The correlated error turns the part- 
whole correlation into an ambiguous index, 
not one that would be “wrong” in a computa- 
tional sense, but one that would lead to wrong 
interpretations if treated as a correlation of 
the usual type. A part-whole correlation of 
.95, obtained in the way just described, does 
not signify the same kind of relationship as 
a correlation of the same magnitude between, 
say, two different forms of the same test. 

Identifying the difficulty practically 
amounts to solving it. It becomes evident that 
in cases of nonindependent test administration 
we need to replace the usual part-whole cor- 
relation by a modified index in which the 
spuriously perfect correlation between the 
part and itself is replaced by its reliability 
coefficient. Such an index will tell us what 
the relationship is between part and whole if 
their errors are kept independent. For the 
case in which the subtests have equal weights 
and variances, we may write the following ex- 
pression for the modified part-whole correla- 
tion: 

$$r'_{pw} = \frac{\sum\sum r_{jl}}{2\sqrt{\tfrac{1}{2}n + \sum r_{jk}}\,\sqrt{\tfrac{1}{2}t + \sum r_{lm}}} \tag{3}$$

where $r'_{pw}$ = modified coefficient of correla-
tion between the composite part and the com-
posite whole; $r_{jl}$ = correlation between any
subtest j included in the part and any subtest
l included in the whole, where any included
correlation between a subtest and itself is
represented by its reliability coefficient; $r_{jk}$
= correlation between any subtests j and k
belonging to the part (where subscript k is
numerically larger than subscript j); $r_{lm}$ =
correlation between any subtests l and m be-
longing to the whole (where subscript m is
numerically larger than subscript l); n =
number of subtests included in the part, and
t = number of subtests included in the whole.
$\sum\sum r_{jl}$ may be obtained by totaling the fol-
lowing three sums: (a) the sum of the relia-
bilities of the subtests comprising the part,
(b) twice the sum of the intercorrelations
among the subtests comprising the part
($2\sum r_{jk}$), (c) the sum of the intercorrelations
between any subtest included in the part and
any subtest not included. The use of odd-
even reliabilities may achieve for the shared
components of part and whole about the same
degree of error independence as exists between
part and remainder obtained during the same
test administration.

As illustration, we may determine $r'_{pw}$ for
the Doppelt short form referred to earlier. We
would expect this correlation to be lower than
the regular $r_{pw}$, which Doppelt (1956) reports
to be .954 for the 25-34-year-old age group.
For this age group we find $\sum\sum r_{jl}$, computed
following the indicated steps, to be 25.21.
Furthermore: $\sum r_{jk}$ = 3.32; $\sum r_{lm}$ (obtained by
summing all subtest intercorrelations on p. 16
of the manual) = 29.27; ½n = 2; ½t = 5.5.
Substituting these values in Formula 3 we
find $r'_{pw}$ = .927. One could correct this corre-
lation for attenuation, first using Formula 1
for reliability computation, then dividing .927
by the geometric mean of the reliabilities of
part and whole (or by the square root of the
Full Scale reliability if one wishes to esti-
mate the correlation between the short form
and the true Full Scale score).
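
The modified part-whole correlation of this example can be verified with the sketch below. It is an illustration under stated assumptions: the function name is hypothetical, the denominator is written as the product of the two composite standard-deviation terms (algebraically the same quantity as a form using ½n and ½t), and the inputs are the sums quoted above for the 25-34-year-old group.

```python
from math import sqrt

def modified_part_whole(sum_rjl, sum_rjk_part, sum_rlm_whole, n, t):
    """Formula 3: part-whole correlation with self-correlations
    replaced by reliabilities.

    sum_rjl       -- double sum of part-by-whole correlations
    sum_rjk_part  -- sum of intercorrelations within the part (j < k)
    sum_rlm_whole -- sum of intercorrelations within the whole (l < m)
    n, t          -- number of subtests in the part and in the whole
    """
    return sum_rjl / (sqrt(n + 2 * sum_rjk_part) * sqrt(t + 2 * sum_rlm_whole))

# Doppelt short form against the 11-subtest WAIS Full Scale.
r_pw_mod = modified_part_whole(25.21, 3.32, 29.27, n=4, t=11)
print(round(r_pw_mod, 3))
```

The result falls below the conventional part-whole value of .954 reported by Doppelt, as expected once the shared error is removed.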

If we were to use the present procedure
(with or without attenuation correction) to 
rank various subtest combinations according 
to their correlation with the Full Scale, we 
would probably obtain results that deviate 
from earlier published orderings that used the 
regular part-whole index (Clements, 1965; 
Enburg, Rowley, & Stone, 1961; Howard, 
1959; Jones, 1962; Maxwell, 1957; McNe- 
mar, 1950). 

Formula 3, or an equivalent, apparently
has not been proposed before.² In correcting
for error contamination, it provides meaning-
ful information concerning the relationship
between part and whole. It is hard to see,
actually, how without this correction the
part-whole correlation, in the case of non-
independent test administration, can be use-
ful in other than rather special circumstances.

2 However, in the case where n = 1, Formula 3
may be simplified and then becomes the equation
proposed by Cureton for the correlation between a
single test item and the total test score (Cureton,
1966, Equation 5).

The part-whole correlation has been criti- 
cized before in connection with Wechsler short 
forms. One criticism calls attention to the 
misclassifications that result from the use of a 
short form when the Full Scale classification 
is treated as criterion (Mumpower, 1964). As 
the criterion classes become narrower and 
more numerous, the number of misclassifica-
tions increases and may become quite large,
even if the part-whole correlation is quite
high (Silverstein, 1965). This particular ob-
servation, however, is no more relevant to
part-whole correlations than it is to other
predictor-criterion relationships. As a criti-
cism it is possibly misleading, in that it em-
phasizes the increase in number of errors as
classes become narrower without pointing out 
the concomitant decrease in their average 
magnitude (importance). Also, the standard 
of perfect criterion prediction, implied in the 
criticism, ignores the fact that even criteria 
are flawed by unpredictable error variance. 
Zytowski and Hudson (1965) question 
part-whole correlations more along lines fol- 
lowed in the present paper. These authors, 
however, treat the contribution of the part’s 
true score to the Full Scale score as no more 
desirable than the contribution of its error: 
They eliminate all contribution of the part to 
the whole, selecting the correlation between 
part and remainder (McNemar, 1962, p. 164) 
as the appropriate index. Here, of course, one 
no longer has an expression of the relationship 
between part and whole. We would maintain
that $r'_{pw}$ is to be preferred, because in elimi-
nating error overlap, but not true-score over-
lap, it spares the baby the fate of bath water.
The regular part-whole correlation remains 
appropriate when part and whole have been 
obtained independently (in different test ad- 
ministrations). At least one Wechsler short- 
form study (Mogel & Satz, 1963) presents 
correlations for data obtained in this fashion. 
Even an appropriate part-whole correlation 
does not eliminate the need for reliability 
data. Consider once more the short-form vari- 
ant discussed in the preceding section which 
uses abbreviated subtests. The proponents of 
this format were impressed by the high part- 
whole correlations between the abbreviated
and full-length subtests (computed in the con-
ventional way). Even when corrected, the
part-whole correlations would almost cer- 
tainly have been higher than the abbreviated 
subtests’ reliabilities: The correlation between 
a test and a more reliable measure of the 
same function is higher than the test’s own 
reliability. The latter, however, is what we 
need to know in order to determine profile 
reliability—the critical issue in this instance. 


SCORE STANDARDIZATION

In order to make practical use of a novel 
subtest combination one must find a way of 
appraising the magnitude of the obtained 
total score. Here one faces the problem of 
converting composite scores into standardized
quantities. Two conversion methods have be-
come prominent in the Wechsler literature:
“prorating” and the use of regression equa- 
tions. Both methods originated in connection 
with the use of short forms. 

The advantage of prorating is that it al- 
lows one to use the IQ conversion tables of 
the Wechsler manuals (cf. Wechsler, 1955, p. 
31). The method, however, assumes that the 
subject would have obtained on the omitted 
subtests an average score equal to his average 
score on those that were administered. The 
assumption is, of course, incorrect in varying
degrees. The result is that prorating, while
setting the mean IQ of the normative group
correctly at 100, tends to inflate the norma-
tive standard deviation; in other words, it
tends to generate IQ values that are too ex-
treme. This effect may be reflected in some of
the reported standard deviations of prorated
IQs as compared to Full Scale IQs (e.g.,
Nickols & Nickols, 1963; Yudin, 1966).
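
The size of this inflation can be estimated from figures already quoted for the Doppelt short form. The sketch below rests on assumptions not stated in the text: that prorating multiplies the short-form sum by the ratio of total to administered subtests (11/4), that the Full Scale IQ is a linear function of the Full Scale sum with a normative standard deviation of 15, and that age-scaled scores with a standard deviation of 3 are used throughout. The function names are hypothetical.

```python
from math import sqrt

SUBTEST_SD = 3.0  # scaled-score standard deviation of every subtest

def composite_sd(n, sum_r):
    """Standard deviation of a sum of n equally variable subtests."""
    return SUBTEST_SD * sqrt(n + 2 * sum_r)

def prorating_sd_ratio(n_part, sum_rjk_part, n_whole, sum_rlm_whole):
    """Ratio of the prorated-sum SD to the true Full Scale sum SD."""
    prorated_sd = (n_whole / n_part) * composite_sd(n_part, sum_rjk_part)
    return prorated_sd / composite_sd(n_whole, sum_rlm_whole)

# Doppelt short form, 25-34-year-old group (sums quoted earlier in the text).
ratio = prorating_sd_ratio(4, 3.32, 11, 29.27)
prorated_iq_sd = 15 * ratio
print(round(ratio, 3), round(prorated_iq_sd, 1))
```

Under these assumptions the prorated IQs would have a normative standard deviation somewhat above 15, which is the direction of the distortion described above.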

The other popular method of score conver- 
sion uses regression equations; best known 
perhaps are the regression formulas developed 
by Doppelt from the standardization groups 
for his WAIS short form (Doppelt, 1956). 
There is an important difference between this 
type of regression estimate and Wechsler’s 
Deviation Quotient. The latter uses the nor- 
mative groups as reference groups only, “yard- 
sticks” against which to gauge obtained per- 
formance levels. But a Doppelt-type regres- 
sion estimate assumes, in addition, that the 
subject is a random representative of a pop- 
ulation with the same performance character-
istics as the age-appropriate normative group.
On this basis he is assigned an IQ which 
equals the expected IQ of those members of 
the normative group with whom he has his 
short-form score in common. In other words, 
the normative population not only serves here 
as a reference population, but its distribution 
is also assumed to be comparable to that of 
the subject’s statistical membership popula- 
tion. The reader will recognize that it would 
often be incorrect to make this assumption. 
The “general population” represented by the 
Wechsler normative groups may have virtues 
as a reference group, but its distribution will 
frequently not resemble the distribution of 
our subject’s membership population. There- 
fore, the Doppelt estimate will, for example, 
often “regress to the wrong mean.” This may 
cause systematic underestimates or overesti- 
mates, depending on whether one’s sampling- 
population averages are above or below the 
general-population mean. This type of bias 
may be discernible in certain differences found 
between Doppelt means and Full Scale means 
(e.g., Watson, 1966). Of course, one might 
choose not to treat this type of estimate as a 
regression estimate and view it, instead, as a 
transformation which, in contrast to prorat- 
ing, shrinks the normative standard deviation. 
But this, too, is undesirable. 

It is evident that neither prorating formu- 
las nor regression estimates derived from the 
normative population are satisfactory. The 
authors suggest that both be abandoned, and 
that they be replaced by the familiar Wech- 
sler Deviation Quotient. The DQ has a nor- 
mative mean of 100 and standard deviation of 
15 and, as pointed out earlier, uses normative 
groups as reference groups only. The compu- 
tation of a DQ for any given composite is 
straightforward. The raw scores on the com-
ponent subtests are first converted into
age-appropriate scaled scores and then
summed to yield the composite score, $X_c$. The
normative mean, $\bar{X}_c$, of our composite equals
10n, while its standard deviation, $S_c$, equals


$$S_c = S_s\sqrt{n + 2\sum r_{jk}}$$


where n and $r_{jk}$ are defined as in Formula 1,
and where $S_s$ is the subtest standard deviation,
which in our case has the uniform value of
3.

TABLE 1
CONSTANTS FOR CONVERTING WECHSLER COMPOSITE SCORES INTO DEVIATION QUOTIENTS

      2 Subtests              3 Subtests              4 Subtests              5 Subtests
  Σr_jk      a     b      Σr_jk      a     b      Σr_jk      a     b      Σr_jk      a     b
 .78-.92    2.6   48    2.16-2.58   1.8   46    3.95-4.85   1.4   44    6.96-8.83   1.1   45
 .66-.77    2.7   46    1.79-2.15   1.9   43    3.21-3.94   1.5   40    5.50-6.95   1.2   40
 .54-.65    2.8   44    1.48-1.78   2.0   40    2.60-3.20   1.6   36    4.36-5.49   1.3   35
 .44-.53    2.9   42    1.21-1.47   2.1   37    2.09-2.59   1.7   32    3.45-4.35   1.4   30
 .35-.43    3.0   40     .97-1.20   2.2   34    1.66-2.08   1.8   28    2.71-3.44   1.5   25
 .26-.34    3.1   38     .77- .96   2.3   31    1.29-1.65   1.9   24    2.10-2.70   1.6   20
 .19-.25    3.2   36     .59- .76   2.4   28     .98-1.28   2.0   20    1.59-2.09   1.7   15


Upon determining $\bar{X}_c$ and $S_c$ we may directly
compute the Deviation Quotient by
transforming $X_c$ in the following manner:


$$DQ = (15/S_c)(X_c - \bar{X}_c) + 100 \qquad [4]$$


Computation of a DQ, while not overly 
demanding, still involves a number of steps: 
determination of $\bar{X}_c$, of $\sum r_{jk}$, and of $S_c$, then
substitution in Formula 4. Tables 1 and 2 
have therefore been prepared to reduce com- 
putational labor. 

Table 1 is a general aid in the conversion 
of scores obtained from any combination of 
two, three, four, or five subtests. First one
computes $\sum r_{jk}$, the sum of the intercorrelations
of the component subtests, using the
WISC or WAIS correlation table of the group
which is closest in age to the subject in question
(Doppelt & Wallace, 1955; Wechsler,
1949, 1955). Computation of $\sum r_{jk}$ serves to
locate in Table 1, under the appropriate heading,
the row that contains the needed values of
two constants, a and b. The DQ may be obtained
by first multiplying $X_c$, the sum of the
component subtest scores, by a and then adding
b to $aX_c$.

An example: Suppose a 30-year-old indi- 
vidual obtains a score of 50 on Doppelt’s four- 
subtest short form. (Since this score is 
obtained by summing the age-appropriate sub- 
test scaled scores on the four component sub- 
tests, it may not be identical to the one needed 
for application of Doppelt’s formula.) Con- 
sulting page 16 of the WAIS manual we find 
$\sum r_{jk}$ to be 3.32. The appropriate row in Table
1 is, therefore, the second one under the head- 
ing “4 Subtests.” There we find specified for 
a and b the values of 1.5 and 40, respectively. 


Consequently we obtain a DQ of 115 (1.5 × 50 + 40).
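
The arithmetic of this example can be retraced in a short Python sketch. It is an illustration only, not part of the authors' procedure: the function names are hypothetical, and it assumes that the tabled constant a is 15/S_c rounded to one decimal and that b is then taken as 100 minus a times the normative mean of 10n, an assumption that reproduces the tabled values checked here.

```python
import math

SUBTEST_SD = 3.0     # scaled-score standard deviation, as stated in the text
SUBTEST_MEAN = 10.0  # scaled-score mean, implied by the normative mean of 10n


def composite_sd(n_subtests, sum_r):
    """S_c = S_s * sqrt(n + 2 * sum of the subtest intercorrelations)."""
    return SUBTEST_SD * math.sqrt(n_subtests + 2.0 * sum_r)


def exact_dq(composite, n_subtests, sum_r):
    """Formula 4 applied without any rounding of the constants."""
    s_c = composite_sd(n_subtests, sum_r)
    return 15.0 / s_c * (composite - SUBTEST_MEAN * n_subtests) + 100.0


def table_constants(n_subtests, sum_r):
    """Rounded constants (a, b); rounding scheme is an assumption, see lead-in."""
    a = round(15.0 / composite_sd(n_subtests, sum_r), 1)
    b = round(100.0 - a * SUBTEST_MEAN * n_subtests)
    return a, b


# Worked example from the text: four subtests, sum of r = 3.32, composite = 50.
a, b = table_constants(4, 3.32)
print(a, b)                              # 1.5 40
print(round(a * 50 + b))                 # 115
print(round(exact_dq(50, 4, 3.32), 1))   # 115.3
```

Under these assumptions the tabled shortcut and the unrounded Formula 4 agree to within a fraction of a point for this case.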

For users of the Doppelt abbreviation it 
may be of interest that for each of the fol- 
lowing normative age groups we find the re- 
spective values for a and b to be 1.5 and 40: 
18-19, 25-34, 45-54, 60-64, 65-69. Between 
ages 18 and 69 we may, in other words, use 
the same transformation as in the above ex- 
ample. In age group 70-74, a is 1.6 and b is 
36; in the group of 75 and over, a is 1.7 and 
b is 32. 

For many purposes the DQs obtained 
through Table 1 will be accurate enough. At 
two standard deviations below or above the 
mean, none of the DQ conversions rounded to 
integers will be in error by more than one 
point. 


STANDARDIZATION OF FACTOR SCORES 


Among the many possible subtest combina- 
tions, those suggested by factor-analytic re- 
search deserve special consideration. Compos- 
ite factor scales will not only be generally 
more reliable than single subtests, but they 
also promise to yield distinctive and func- 
tionally meaningful performance profiles. 

Table 2 is specifically designed to simplify 
the computation of DQs for the Wechsler 
factor measures that have been proposed by 
Cohen (1957a, 1959). Briefly, from his factor 
analyses of Wechsler normative groups Cohen 
concluded that the correlation matrices ob- 
tained from the oldest WISC group and from 
the WAIS groups in the age range 18-54 
possess similar factorial structures. In addi- 
tion to a General Factor, he isolated in these 
groups three factors which he labeled Verbal 
Comprehension (VC), Memory or Freedom 
from Distractibility (M-FD), and Perceptual
Organization (PO). The following subtest 
combinations were proposed for obtaining fac- 
tor scores: for VC, Information, Comprehen- 
sion, Similarities, and Vocabulary; for M- 
FD, Arithmetic and Digit Span; and for PO, 
Block Design and Object Assembly. The Gen- 
eral Factor is well represented by the Full 
Scale. 

It should be added that for the oldest WAIS 
and youngest WISC groups Cohen reported 
deviating factorial findings, but a later analy- 
sis did not confirm this result for the elderly 
groups (Riegel & Riegel, 1962; however, see 
also Berger, Bernstein, Klein, Cohen, & Lu- 
cas, 1964). Further evidence will be needed 
to determine more decisively the age range for 
which the proposed factor measures are ap- 
plicable and useful. At this point it seemed 
reasonable that Table 2 would cover the 13.5- 
year-old WISC group and the full range of 
WAIS age groups. 

Use of Table 2 involves the following steps. 
The raw scores on the component subtests 
must first be converted into age-appropriate 
scaled scores. The composite scores are then,
as usual, obtained by summing the scaled 
component subtest scores. Next one turns to 
Table 2 and selects the row of the age group 
to which the subject belongs or to which he 
is closest in age. The row in question specifies 
for each of the three factor scores a certain 
value for a and b. Just as in Table 1, the DQ 
is obtained by first multiplying the composite 
score by a and then adding b to the product. 
Table 2 also shows factor scale reliabilities 
(derived, using Formula 1, from the published 
subtest reliabilities). 

The following brief case description illus- 
trates the use of Table 2. The patient, a 59- 
year-old surgeon, had suffered a stroke more 
than 2 years ago which had left him with a 
mild left hemiparesis. Since the stroke he had 
not practiced medicine and had been essen- 
tially unemployed. The WAIS was adminis- 
tered as part of a comprehensive evaluation, 
in a rehabilitation setting, of the patient’s 
current work potential. The results were: 
Verbal IQ, 120; Performance IQ, 104; Full 
Scale IQ, 114. 

TABLE 2
CONSTANTS FOR CONVERTING WECHSLER FACTOR SCORES INTO DEVIATION QUOTIENTS (AND RELIABILITY COEFFICIENTS)

                      VC a                 M-FD b                PO c
Test    Age        a     b    r_cc      a     b    r_cc      a     b    r_cc
WISC    13½       1.5   40    .93      3.0   40    .74      2.8   44    .87
WAIS    18-19     1.4   44    .96      2.9   42    .84      2.7   46    .86
        25-34     1.4   44    .96      2.9   42    .82      2.8   44    .85
        45-54     1.4   44    .96      2.8   44    .85      2.8   44    .86
        60-64     1.4   44     -       2.9   42     -       2.7   46     -
        65-69     1.5   40     -       2.9   42     -       2.7   46     -
        70-74     1.5   40     -       2.8   44     -       2.8   44     -
        75+       1.5   40     -       2.7   46     -       2.7   46     -

a Verbal Comprehension: subtests Information, Comprehension, Similarities, and Vocabulary.
b Memory or Freedom from Distractibility: subtests Arithmetic and Digit Span.
c Perceptual Organization: subtests Block Design and Object Assembly.


For the Cohen factor scales, the sums of
the component age-appropriate scaled subtest
scores were: VC, 57; M-FD, 22; PO, 18. For
conversion into DQs we turn, in view of the
patient’s age, to Row 5 of Table 2. Applying
the specified a and b values to each of the
above scores we obtain the following quotients:
VCQ, 124; M-FDQ, 106; POQ, 95.
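
The three conversions can be checked with a few lines of Python. This is only a sketch of the arithmetic: the dictionary and function names are hypothetical, and the constants are those read from Row 5 of Table 2 (WAIS, ages 60-64).

```python
# Constants (a, b) read from Row 5 of Table 2 (WAIS, ages 60-64).
ROW_5 = {"VC": (1.4, 44), "M-FD": (2.9, 42), "PO": (2.7, 46)}

# Sums of age-appropriate scaled scores reported for the patient.
composites = {"VC": 57, "M-FD": 22, "PO": 18}


def factor_dq(composite, a, b):
    """DQ = a * composite + b, rounded to the nearest integer."""
    return round(a * composite + b)


quotients = {name: factor_dq(composites[name], *ROW_5[name]) for name in composites}
print(quotients)  # {'VC': 124, 'M-FD': 106, 'PO': 95}
```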

The picture presented by the Wechsler IQs 
and the factorial DQs is consistent with the 
impairment of visual-spatial function often 
associated with later-acquired right hemis- 
phere pathology. In this particular case, the 
patient’s differential impairment seems more 
clearly expressed by his factor scores. It is 
not difficult to relate these results to other ob- 
servations—the patient’s well-preserved verbal 
facility, his complaints of concentration diffi- 
culties in the presence of rather mild extrane- 
ous stimuli, the marked impairment of visual- 
spatially mediated motor skills. 


CONCLUDING REMARKS 


This paper aims to reaffirm basic psycho- 
metric principles in the use of “new” Wech- 
sler scales. The authors hope it will contribute 
to improved and more flexible psychometric 
practice. 

With respect to specific subtest combina- 
tions, Doppelt’s short form has acceptable 
reliability and is a good abbreviated measure 
of the General Factor. The authors would, 
however, recommend the modified method of 
subtest scaling and composite score conversion
outlined above; the approach is sounder and 
simpler. Even so, we prefer to think of short 
forms as a necessary evil, since we believe 
that present clinical assessment needs call 
for more, rather than less, extensive cognitive 
evaluation (cf. Schofield, 1966). 

Administration of a complete Wechsler 
scale leaves us with the possibility of using 
new subtest combinations. Cohen’s proposed 
factorial measures may be clinically useful 
summaries of part of the Wechsler; they 
could supplement the Full Scale IQ and, pos- 
sibly, even replace the a priori Verbal and 
Performance IQs. There is room, of course, 
for improving our psychological understanding
of these factors and, particularly in the
case of M-FD and PO, for further improving
their measurement. That other distinct and
clinically important dimensions of cognitive 
functioning exist also cannot be disputed. One 
hopes that future investigations will enable 
us to incorporate the 11 Wechsler subtests 
into a more comprehensive system of cogni- 
tive (ego) functioning. 
SOME PERCEPTUAL AND COGNITIVE CORRELATES
OF STRONG APPROVAL MOTIVATION 1


JOHN M. ROSENFELD 
Kailua, Hawaii 2 


50 male college students were administered 2 measures of field dependence, 
the rod and frame test (RFT) and Thurstone’s embedded figures test (EFT); 
2 tasks to specify kinds of strategies used in reception and selection concept- 
attainment situations; a semantic differential to measure meaning attribution; 
and a delayed auditory-feedback task (DAF) to measure interference with 
speech. Ss with a strong need for approval, as measured by the M-C SD, were 
field dependent on the RFT (p < .05 >.01) as predicted (but not on the EFT, 
contrary to prediction) and had a retarded rate of speech in the DAF task 
(p<.10>.05) as predicted. In a 2nd experiment (N=40), the RFT 
result was replicated (p < .01), as was the DAF result (p < .05 > .01). The 
concept-attainment and meaning-attribution hypotheses were not confirmed. 
The hypothesis that strong approval motivation is developmentally lower than
weak approval motivation is discussed briefly.


This study investigated four perceptual and 
cognitive correlates of strong approval motiva- 
tion. Research by Crowne and Marlowe 
(1964) indicates that a person with a high 
need for approval is influenceable, credulous, 
and quite dependent on others for cues. His 
approval seeking leads him to be conforming 
and responsive to minimal social reinforce- 
ment. He is cautious, defensive, and easily 
threatened. This defensiveness is apparently 
due to anticipated threats to his already low 
self-esteem. He avoids situations that threaten 
his self-esteem and uses repressive denial 
defenses against feelings of hostility. 


HYPOTHESES 
Field Dependence-Independence 


A considerable amount of work has been 
done in this area (Witkin, Dyk, Faterson, 
Goodenough, & Karp, 1962). In his latest 
work, Witkin uses three measures of field 
dependence-independence: a body-adjustment 
test, the rod and frame test, and the embedded
figures test. The factor common to
these tests is the separation of an object (the
body, the rod, or a geometric design) from
the particular field it is embedded in. This
interpretation led to the formulation of the
dimension of field dependence-independence.


1 This study was part of a doctoral dissertation
submitted to the Graduate School of Ohio State
University. The author expresses his sincere appreciation
to Douglas P. Crowne, who served as
dissertation adviser.

2 415 North Kalaheo Avenue, Kailua, Hawaii
96734.

It is predicted that subjects with a high 
need for approval (high-need) will be percep- 
tually field dependent. The high-need indi- 
vidual looks outside himself to his environ- 
ment for cues which indicate how to act in 
order to win approval. This is seen in his 
general amenability to outside influence. A 
person with such an extreme reliance on his 
environment will show a reliance on the sur- 
rounding perceptual field. In a task that re- 
quires the separation of an object from the 
perceptual field in which it is embedded, the 
high-need person will have difficulty. He will 
rely more on the field and less on his own 
body cues and sensations. He will be percep- 
tually field dependent. 


Concept Attainment 


The present study is an attempt to sketch 
the relation between the use of concept- 
attainment strategies (Bruner, Goodnow, & 
Austin, 1956) and high approval motiva- 
tion. Bruner et al. used a set of 81 cards 
to study conjunctive concepts by means 
of selection strategies. A conjunctive concept 
was defined (Bruner et al., 1956) as a “set of 
the cards that share a certain set of attribute 
values, such as ‘all red cards’ or ‘all cards 
containing red squares and two borders’ [p. 
83].” Selection strategies refer to the strategy
used by the subject to attain a concept
when the complete array of cards is spread out 
before him. 

Bruner et al. (1956) discuss “focusing” and 
“scanning” approaches to concept attainment. 
Focusing strategy refers to choices that vary 
in only “one attribute value from those at- 
tribute values of the focus card that had 
been found relevant or were still untested 
[p. 95].” (The focus card is the card which 
is presented by the experimenter at the be- 
ginning of the task as an example of the con- 
cept he has in mind.) Scanning strategy re- 
fers to choices that vary in more than one 
attribute. 

Another concept-attainment situation is one 
in which the information comes in bits that 
are received serially. The subject must use the 
information as it arrives, rather than have the 
whole universe of information spread out be- 
fore him. This kind of concept-attainment 
situation deals with reception strategies. 

One way of analyzing reception strategies 
is in terms of the initial hypothesis that the 
subject makes. The subject is presented a 
focus card which is an example of the concept 
the experimenter has in mind. The subject is 
instructed to write down his hypothesis as to 
what the concept is. If the initial hypothesis 
includes all the attributes of the focus card, 
this is considered to be a wholist strategy. 
A partist strategy is one in which any lesser 
number of attributes is included in the 
hypothesis.3

What strategies would the high-need person 
use? Although hypotheses concerning this 
question do not follow unambiguously from 
prior research and theory, it seems plausible
to expect that global, nonanalytic strategies
will be used. Thus, the hypothesis is that
high-need subjects will use scanning and
wholist strategies in concept attainment.4


3 Bruner et al. (1956) use the terms “wholist” and
“partist” to refer only to the number of attributes
contained in the initial hypothesis. The terms should
not be construed as having any Gestalt or configurational
implications.

4 Further, the high-need individual’s handling of
a situation with an experimenter present should
focus primarily on the interpersonal nature of the
situation, even though there is a specific and concrete
problem to be solved. Rather than assuming a
task orientation, he should structure the situation in
terms of seeking approval from the experimenter and
in being positively evaluated. What can he do to
receive the approval of the experimenter? The answer
is to attain the concept as quickly as possible.
If this analysis is correct, then this type of person,
in his desire to finish the problem as quickly as
possible and be positively evaluated, should use a
more global approach, one that attempts to handle
as many cues as is possible at any one time. The
subject’s reasoning here might be that if he can
handle many or all of the apparently relevant cues
at once, then the concept can be attained faster;
with the quick attainment of the concept would,
presumably, come the positive evaluation of the
experimenter.


Meaning Attribution


If the high-need subject uses nonanalytic,
global strategies in concept-attainment tasks
and is field dependent, it seemed plausible to
expect a similar approach in assigning meanings
to concepts. An ideal tool for studying
this problem is the semantic differential (Osgood,
1952). If the hypothesis is correct, two
predictions should be confirmed. Such a person
will be unlikely to use the extreme scale
points (Numbers 1 and 7) on each of the
scales. Checking a concept on the extreme end
of a meaning scale implies making a finer,
less global discrimination. A global approach
to meaning would entail using the categories
that are closer to the midpoint of the meaning
scale. The second prediction is that high-need
subjects will be more likely to use the
center category on the semantic differential
than low-need subjects.


Interference with Speech Processes


Delayed auditory feedback (DAF; Yates,
1963) refers to a situation in which a person
hears his own voice fed back to him with a
very short time delay. An individual’s speech
is often seriously affected under such circumstances.

The vulnerability of the high-need individual
to threatening situations suggests that
in the DAF situation he will show disruption
in his speech. The omnipresent fear of being
negatively evaluated and this threat to the
individual’s self-esteem will make the DAF
task rather threatening. The hypothesis is
that high-need subjects, when compared to
low-need subjects, will have a slower rate of
speech in the DAF situation.


METHOD
Subjects 


The Ss were 50 male undergraduate students en- 
rolled in the introductory psychology course at Ohio 
State University. They volunteered to participate in 
order to satisfy a course requirement. All Ss were
freshmen and sophomores, with mean age 19 years, 
4 months. 


Marlowe-Crowne Social Desirability Scale 
(M-C SD) 


The M-C SD served as the measure of the need 
for approval (Crowne & Marlowe, 1960, 1964). This 
scale consists of 33 items judged to reflect culturally 
approved, acceptable behaviors which are unlikely to 
occur. To control for order effects, the M-C SD and 
the experimental tasks were administered so that 
each appeared an equal number of times in each 
administration position. 


Field Dependence-Independence 


This dimension was studied with the rod and 
frame test (RFT) and the embedded figures test 
(EFT). As adapted from the original apparatus of 
Witkin et al. (1962), the RFT consisted of a 30-
inch square luminous frame pivoted so that it could
be tilted to the left or the right. Within the frame 
and pivoted independently of it was a 26-inch 
luminous rod. The rest of the RFT apparatus was 
painted a dull black color. The test was adminis- 
tered in a windowless dark room that was also 
painted dull black. 

The standard procedure is to tilt the frame and 
have the S adjust the rod to what he believes is 
the true vertical. The E makes the adjustments of
the rod according to the S’s instructions. A large
degree of tilt from the true vertical when the S
reports the rod to be vertical indicates dependence
on the visual field. A small degree of tilt indicates
less dependence on the visual field and a greater
reliance on bodily cues and sensations.5

The S was brought to the experimental room and 
the standard instructions (Witkin et al., 1962) were
read to him. The S entered the room and sat in an
armless chair 7 feet away from the apparatus. One
practice trial was given, followed by eight test
trials. The tilt of the frame and the rod was varied 
throughout the test series. 

The S told the E in which direction to move the
rod and where to stop. The field-dependence score 
was the mean absolute error, in degrees, from the 
true vertical for the eight trials. 
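
As an illustration of this scoring rule, the following Python sketch computes a mean absolute error over eight trials. The trial values are hypothetical (they are not data from the study), the function name is an assumption, and rod settings are taken to be signed deviations in degrees from the true vertical.

```python
def rft_score(settings_in_degrees):
    """Mean absolute deviation from the true vertical (0 degrees) over the test trials."""
    if not settings_in_degrees:
        raise ValueError("at least one trial is required")
    return sum(abs(x) for x in settings_in_degrees) / len(settings_in_degrees)


# Hypothetical signed rod settings for the eight test trials (left tilt negative).
trials = [4.0, -2.5, 3.0, -1.5, 5.0, -3.0, 2.0, -3.0]
print(rft_score(trials))  # 3.0
```

Taking absolute values keeps errors to the left and to the right from cancelling each other in the mean.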


5 The RFT has typically been administered in con- 
junction with a tilting chair. The standard test 
consists of three series of trials. In two series the S 
is tilted to the left and right and in the third series 
the S is vertical. Witkin et al. (1962) have reported,
however, that the scores for the upright
series alone can be used as a substitute for the
total series with no loss in validity. The tilting chair
was not used in this study.


The EFT was four pages of the Thurstone figures 
(Thurstone, undated). The task is to locate a design 
that is concealed in a larger and more complex 
design. In order to perform adequately, the effect of 
the embedding context must be overcome. The S
was given a sample set of designs and the standard 
Thurstone instructions were read to him. 

Two scores were computed for each S: the num- 
ber of problems completed within the 5-minute 
limit and the percentage of attempted problems that 
were solved correctly. 


Concept Attainment 


Selection strategies. A set of 54 8- by 10-inch 
white cards on which there were small cutout figures 
was used. The cards varied in four attributes: the 
shape of the figure (square, triangle, or rectangle), 
the number of figures (one, two, or three), the 
color of the figures (black, red, or green), and the 
number of black corners marked on the card itself 
(one or two). 
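
The size of the card set follows from the attribute values just listed (3 × 3 × 3 × 2 = 54). A minimal Python sketch, with hypothetical names, enumerates the set and counts how many attributes separate a chosen card from a focus card. The difference count is a simplification added here for illustration; it is not the full focusing criterion of Bruner et al.

```python
from itertools import product

SHAPES = ("square", "triangle", "rectangle")
NUMBERS = (1, 2, 3)
COLORS = ("black", "red", "green")
CORNERS = (1, 2)

# Every combination of the four attributes yields one card.
deck = list(product(SHAPES, NUMBERS, COLORS, CORNERS))
print(len(deck))  # 54


def differing_attributes(card, focus):
    """Number of attributes on which a chosen card differs from the focus card."""
    return sum(1 for x, y in zip(card, focus) if x != y)


# Illustrative focus card: two green squares and one black corner.
focus = ("square", 2, "green", 1)
print(differing_attributes(("square", 2, "red", 1), focus))    # 1
print(differing_attributes(("triangle", 3, "red", 1), focus))  # 3
```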

The array of cards was spread out on a table and 
remained in sight while the following instructions 
were read to the S:


In this part of the study we are interested in
how people form concepts. In front of you are a
large number of concept formation cards. For the
purposes of this study, we will define a concept as
a set of the cards that share a certain set of
values, such as “all black cards” or “all cards
containing red squares and two black corners.”
For example, show me the cards that fit the
concept “two black triangles.” (S pointed to the
cards that had two black triangles on them.) Now,
this is an important point and you must understand
it in order to work the problems correctly.
The concept “two black triangles” also appears in
two other cards, here and here. (E pointed to the
cards containing three black triangles.) Do you
see why? The concept “two black triangles” is
contained within the cards with three black triangles
on them. We have a concept in mind and
certain of the cards in front of you illustrate it
while some of the cards do not. Your task is to
determine what this concept is. We will begin by
showing you a card that illustrates the concept:
a positive or correct example of the concept. Your
job is to choose cards for testing, one at a time,
and after each choice we will tell you whether the
card is a correct or incorrect instance of the concept.
You may offer an hypothesis after any
choice of a card (that is, your best guess as to
what the concept is) but you may not offer more
than one hypothesis after any particular choice.
If you do not wish to offer any hypothesis, you
need not do so. You may select the cards in any
order you choose. Try to form the concept as
efficiently as possible.

In a number of cases the S was unsure about the 
nature of the task. The E reexplained what the S 
was supposed to do, sticking as closely as possible to 
the original instructions.

The E then presented the first focus card on
which were two green squares and one black corner. 
The first concept was “two squares.” The second 
focus card contained three red triangles and two 
black corners, and the second concept was “two red 
triangles.” The third concept was “one black tri- 
angle,” and its focus card contained two black 
rectangles and two black corners.

The cards that the S chose and his hypotheses 
were recorded. The choices were examined for kind 
of strategy used. Following Bruner et al. (1956, p.
95), “a focuser was defined as a subject whose 
choices in the main varied only in one attribute 
value from those attribute values of the focus card 
that had been found relevant or were still un- 
tested.” If a majority of the choices were of this 
type, the S was considered to be a focuser; if the 
majority of the S’s choices varied in more than one 
attribute value, he was considered a scanner. 

Reception strategies. Similar cards were used here. 
The following instructions were read to the S: 


In this experiment, we'll be using cards like this. 
Each card varies in four ways. It can have one, 
two or three figures on it; they can be rectangles, 
squares, or triangles; they can be black, red or 
green; and they can have two or three black 
corners on them. Here we are interested in how 
people form concepts. A concept, for our purposes, 
is a set of the cards that share a certain set of 
values, such as “all black cards” or “all cards 
containing red squares and two black corners.” I 
will pick a concept that I want you to figure out. 
Then I will show you a series of cards, one at a 
time. I will tell you whether the card is a posi- 
tive or correct example of the concept or a nega- 
tive or incorrect example of the concept. After I 
tell you whether the card is positive or negative, 
you write down your best guess as to what the 
concept is on the answer sheet. Do this for every 
card. When you think you have figured out what 
the concept is, tell me. I will continue showing you 
cards until you tell me to stop. Cover each answer 
with this sheet of paper after you write it down 
and please do not refer to what you have written 
previously. 


Three sequences of cards were presented. Each 
sequence consisted of eight cards chosen by the E
to exemplify a concept. The S’s task was to figure
out what this concept was. Each sequence contained
four cards on which the concept appeared and four 
cards on which it did not appear. The E presented 
each card to the S and told him whether or not the 
concept appeared on it. The S then wrote down his 
hypothesis. The E continued to show cards until the 
S gave the correct concept or until all eight cards 
were shown. 

The initial hypothesis for each of the three con- 
cepts was examined. A wholist strategy included all 
the attributes of the focus card; a partist strategy 
included any lesser number of attributes. If two of 
three or all three of the initial hypotheses were 
wholist hypotheses, the S was considered to be a
wholist; if two or three were partist hypotheses, the
S was considered a partist.
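
The classification rule just described can be sketched in Python. The representation of a hypothesis as a set of attribute labels, the function names, and the example data are all assumptions made for illustration; the study scored written hypotheses by hand.

```python
def classify_hypothesis(hypothesis_attrs, focus_attrs):
    """'wholist' if the initial hypothesis names every focus-card attribute, else 'partist'."""
    return "wholist" if set(focus_attrs) <= set(hypothesis_attrs) else "partist"


def classify_subject(initial_hypotheses, focus_cards):
    """Majority rule over the three initial hypotheses (two or three of three)."""
    labels = [classify_hypothesis(h, f) for h, f in zip(initial_hypotheses, focus_cards)]
    return "wholist" if labels.count("wholist") >= 2 else "partist"


# Hypothetical focus cards and one subject's initial hypotheses.
focus_cards = [
    {"two", "green", "square", "one corner"},
    {"three", "red", "triangle", "two corners"},
    {"two", "black", "rectangle", "two corners"},
]
hypotheses = [
    {"two", "green", "square", "one corner"},      # all attributes: wholist
    {"red", "triangle"},                           # fewer attributes: partist
    {"two", "black", "rectangle", "two corners"},  # all attributes: wholist
]
print(classify_subject(hypotheses, focus_cards))  # wholist
```

With three initial hypotheses and two categories, the majority rule always yields a classification.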


Meaning Attribution 


The semantic differential used to study meaning 
attribution was designed by Haffer 6 and consisted of 
the following concepts: children, abstract art, beat- 
nik, Columbus (Ohio), psychology, mother, father, 
me, best friend, and most disliked other person. The 
standard instructions taken from Osgood, Suci, and 
Tannenbaum (1957) were read by the S and the 
rating made. 

Two scores were computed: the number of times 
Categories 1 and 7 were used and the number of 
times Category 4 (center) was used. 


Interference with Speech Processes 


For this section of the experiment, a Hunter DAF 
apparatus and a Bell and Howell tape recorder were 
used. There were two conditions: delay, in which the 
S read a 250-word textbook passage into a micro- 
phone and heard his voice being fed back to him 
through earphones .05 of a second later; and no- 
delay, in which the S read the same passage into 
the microphone under normal conditions. The two
conditions were randomized throughout the group in 
order to obviate the effects of reading the passage 
twice and thus gaining familiarity with it. For each 
of the two conditions, the S’s voice was recorded 
and the time taken to read the passage was noted. 

The measure of slowed speech was the time taken 
to read the passage in the delay condition minus the 
time taken to read it in the no-delay condition. 


RESULTS 


Scores on the M-C SD were dichotomized at 
the median (15) to form high- (N = 25) and
low-need (N = 25) groups.


Field Dependence-Independence 


Two measures, the RFT and the EFT, 
were involved in the field-dependence hy- 
pothesis. On the RFT the high-need group 
earned a mean score of 3.46, and the low- 
need group earned a mean of 2.41. The difference
was significant (t = 2.44, p < .05 > .01).7

This finding was replicated in a second experiment,
using the same method and sampling
the same subject population (t = 2.94,
df = 38, p < .01).

On the EFT the mean scores of the high- 
and low-need groups were compared by t
test. There were no significant differences for
either measure: the number of problems
solved (t = .24, df = 48) and the percentage
of problems solved correctly (t = .51, df = 48).
The hypotheses were not confirmed.


6 C. Haffer, personal communication, March 1963.
7 All tests of confidence are two-tailed tests.


Concept Attainment 


Selection strategies. The hypothesis that 
high-need subjects would use scanning strate- 
gies was tested by chi-square. The obtained 
value of 2.09 is not significant. Only 13 of 
the 25 high-need subjects used a scanning 
strategy. It is worth noting, however, that 18 
of 25 low-need subjects used a focusing 
strategy. 
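
The reported counts allow a rough check of this chi-square value. The Python sketch below assumes, beyond what the text states, that every subject was classified as either a scanner or a focuser (so that 12 high-need subjects were focusers and 7 low-need subjects were scanners) and that a 2 × 2 test with Yates's continuity correction was used. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]], optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: high-need, low-need. Columns: scanners, focusers.
# 13 and 18 are reported in the text; 12 and 7 are inferred (assumption).
print(round(chi_square_2x2(13, 12, 7, 18), 2))               # 2.08
print(round(chi_square_2x2(13, 12, 7, 18, yates=False), 2))  # 3.0
```

Under these assumptions the corrected value comes close to the reported 2.09, although the source does not say which formula was applied.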

Reception strategies. The hypothesis that 
high-need subjects would use a wholist recep- 
tion strategy was tested by chi-square. The 
obtained value of 2.88 falls at the .10 confidence
level. The hypothesis was not confirmed.


Meaning Attribution 


Two semantic differential scores were used 
to test the hypotheses that high-need subjects 
would use nonanalytic approaches to meaning 
attribution. There were no significant differences
for either measure: use of Categories 1
and 7 (t = .17, df = 48) and use of Category 
4 (t= .96, df= 48). The hypotheses were 
not confirmed. 


Interference with Speech Processes 


It was hypothesized that high-need sub- 
jects would be more affected in the DAF situ- 
ation and would manifest a retarded rate of 
speech. The mean DAF scores were 19.48 for 
the high-need group and 12.72 for the low- 
need group. The means of the two groups 
were compared by t test. The hypothesis was
confirmed at a borderline confidence level (t
= 1.80, p < .10 > .05).

This finding was replicated in a second experiment
using the same method and sampling
the same subject population (t = 2.41, df =
38, p < .05 > .01).


DISCUSSION


The predicted relationship between strong 
approval motivation and field dependence, as 
measured by the RFT, was found in two studies.
These findings have been replicated by
Cooper (1964). Thus, it seems clear that individuals
with a high need for approval are
field dependent on the RFT. However, the
failure to find a relationship between approval 
motivation and EFT scores is troublesome. 
Witkin and his associates (Witkin et al., 
1962; Witkin, Lewis, Hertzman, Machover, 
Meissner, & Wapner, 1954) report that the 
RFT and EFT are highly correlated and 
measure the same underlying dimension of 
field dependence-independence. The most 
serious methodological problem in this study 
is that the Thurstone EFT was used, rather 
than the more complex Witkin figures. A 
number of investigators believe the Thur- 
stone figures do not tap the field-dependence 
dimension nearly as well as the Witkin figures.8
The next obvious step would be to test
the relationship between scores on Witkin’s 
EFT and the M-C SD with the expectation 
that strong approval motivation would be 
associated with EFT field dependence. 

Neither of the two concept-attainment hy- 
potheses was confirmed. There are some ten- 
tative suggestions from these results, how- 
ever, that people with varying strengths of 
approval motivation may use different modes 
of cognition in dealing with their worlds. This 
lead would be worth following using other cog- 
nitive measures. Further research could con- 
firm a cognitive style or styles as associated 
with strong approval motivation. 

The hypotheses concerning semantic dif- 
ferential performances were not confirmed. It 
is obvious that the problem of meaning at- 
tribution needs to be reconsidered. The Os- 
good et al. (1957) factor analysis of semantic 
differential scales elicited a main “evaluative” 
factor. It may be that this factor could dis- 
criminate between high- and low-need groups. 
Evaluation is, possibly, a key axis of reference 
in the life of a strongly approval-motivated 
person. There is a continual weighing of life 
experience on the evaluative scale of what is 
good (approval-eliciting) and what is bad 
(non-approval-eliciting). The prepotency of 
this axis in the person’s life may lead to per- 
formance on the semantic differential that fits 
his evaluative scheme. 

Two studies have shown that approval- 
motivated individuals have a retarded rate of 


3 D. Broverman, personal communication, October
1963; V. Crandall, personal communication, September
1963.
speech in the DAF situation. This finding
lends further credence to the research that has 
shown the vulnerability of high-need people 
to threatening and stressful situations (Crowne 
& Marlowe, 1964). 

Any consistent and organismic view of per- 
sonality seeks theoretical and empirical con- 
sistencies among its constructs. Thus, an ade- 
quate theory would ultimately seek consisten- 
cies between motivational variables and 
perceptual-cognitive variables. The relation- 
ships found between strong approval motiva- 
tion and field dependence and DAF yield some 
insights in this direction. 

Another implication of this study is the 
possible developmental ordering of high- and 
low-need individuals. Field dependence is 
considered to be of lower developmental level 
than field independence (Witkin et al., 1962). 
Since the high-need individual is field de- 
pendent, it is possible that strong approval 
motivation is developmentally lower than 
weaker approval motivation. Such a relation- 
ship is further implied by the consistent 
findings that the high-need person is depend- 
ent on the external world rather than on 
internal cues and norms. These findings sug- 
gest that the high-need person may have some 
degree of difficulty in differentiating himself 
from his environment. This is one of the 
prime indicators of lower developmental levels,
that is, lack of subject-object differentiation
(Werner, 1948; Witkin et al., 1962).
This lead would be well worth investigating 
using other measures of psychological devel- 
opment as criteria. 
INTERVIEW INTERACTION OF REPRESSORS AND SENSITIZERS 1


MARTIN F. KAPLAN 


Northern Illinois University


Research with quantitative variables in the interview has demonstrated that
interviewee verbal patterns can be modified by manipulation of interviewer
verbal patterns. It was predicted that sensitizers and repressors, possessing
idiosyncratic response styles, would as interviewers differ in their behavior
and that these differences would result in different verbal behavior by their
respective interviewees. Repressor, sensitizer, and neutral interviewers were
paired with neutral interviewees in a 15-min. interview. Sensitizer interviewers
were found to take more of an active role, and their interviewees evidenced less
activity than those of repressors and neutrals. The effect of interviewer personality
on both interviewer and interviewee speech, as well as the role of interview
structure, is discussed.


In the past 25 years, a good deal of re- 
search has shown that the verbal patterns of 
interviewees can be influenced by those of 
their interviewers (e.g., Chapple & Arensberg, 
1940; Goldman-Eisler, 1952; Greenspoon, 
1955; Matarazzo, Wiens, & Saslow, 1965). 
Such interviewee parameters as total time 
spent talking, average length of verbal unit,
periods of silence, etc., have been shown to be 
modifiable by manipulation of the respective 
interviewer parameters (Matarazzo et al., 
1965, pp. 193-205). 

Research in the personality dimension of 
repression-sensitization has led some investigators
to the conclusion that persons attaining
extreme scores on the continuum possess idio- 
syncratic response styles in test performance 
(Altrocchi, 1961; Byrne, 1964; Byrne, Go- 
lightly, & Sheffield, 1965; Kaplan, in press; 
Lucky & Grigg, 1964). Sensitizers display
more negative self-concepts (Altrocchi, 1961),
less self-acceptance (Spanner, 1961), and report
more maladjustment (Byrne, 1961;
Byrne et al., 1965; Lucky & Grigg, 1964)
than do repressors. While sensitizers evidence
self-descriptions tending towards maladjustment
and deviance, repressors’ self-descriptions
tend toward denial of these traits. Repression
seems to be inversely related to measures of
maladjustment, social desirability, and anxiety
(Byrne, 1964; Liberty, Lunneborg, & Atkinson,
1964). In the present study, it was
predicted that these differential response styles
on paper-and-pencil self-report tests would be
reflected in differential verbal behavior in an
interview situation. Since variations in the
verbal patterns of interviewers have been demonstrated
to lead to variations in those of
interviewees (Matarazzo et al., 1965, pp. 193-205),
it was further predicted that repressor
and sensitizer interviewers would elicit different
verbal behaviors from their respective
interviewees.


1 This research was supported by Research Grant


METHOD 
Subjects 


The Ss were 120 male undergraduate students 
enrolled in an introductory psychology course at 
Northern Illinois University. These Ss were selected 
from an initial pool of 461 male students who were 
given the revised Byrne Scale of Repression-Sensiti- 
zation (Byrne, Barry, & Nelson, 1963). The S dis- 
tribution was divided into quartiles, and the upper 
quartile was designated as sensitizer (score range 
53-120), the lower quartile as repressor (score range 
4-24), and the middle quartiles as neutral. The S 
groups were filled by randomly selecting sensitizers, 
repressors, and neutrals from their respective groups. 
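
The quartile assignment described above can be illustrated with a brief sketch. It is not the author's procedure in detail: the function name `classify_rs`, the use of interpolated quartile cut points, and the treatment of scores falling exactly on a cut point are all assumptions made for illustration.

```python
from statistics import quantiles


def classify_rs(scores):
    """Label each R-S score by its position in the score distribution.

    Assumption: scores at or below the first quartile are called
    repressors, scores at or above the third quartile sensitizers,
    and everything in between neutral.
    """
    # Quartile cut points of the observed distribution (Python 3.8+).
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    labels = []
    for score in scores:
        if score <= q1:
            labels.append("repressor")
        elif score >= q3:
            labels.append("sensitizer")
        else:
            labels.append("neutral")
    return labels


# Hypothetical pool of 100 distinct scores, used only to exercise the function.
pool_labels = classify_rs(list(range(1, 101)))
```

Interviewers and interviewees would then be drawn at random from within the labeled groups, as the text describes.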


Procedure 


Sixty neutrals were randomly selected as interview- 
ees; they were then randomly paired with 20 re- 
Pressors, 20 sensitizers, and 20 neutrals, all desig- 
nated as interviewers. Prior acquaintanceship of 
interviewer and interviewee was controlled in two 
ways: First, care was taken to select each member 
of a given pair from different course sections; sec- 
ond, Ss were asked before each interview whether 
they were acquainted with their partner. Repression-Sensitization
scale means for the three interviewer
groups were 13.05 (Repressors), 40.6 (Neutrals),
and 75.55 (Sensitizers) and 42.0, 41.2, and
39.95, respectively, for their interviewees.


TABLE 1


INTERVIEW INTERACTION SCORES FOR REPRESSOR,
SENSITIZER, AND NEUTRAL INTERVIEWER GROUPS


                          Interviewer group
Measure            Repressor | Sensitizer | Neutral | F ratio

R-S Score
  Interviewer        13.05   |   75.55    |  40.60  |
  Interviewee        42.00   |   39.95    |  41.20  |
Units
  Interviewer        71.65   |   68.64    |  70.53  |   .10
  Interviewee        70.24   |   68.18    |  70.18  |   .04
Total time a
  Interviewer       241.20   |  341.90    | 280.34  |  5.19**
  Interviewee       537.66   |  440.68    | 539.01  |  4.22*
M time per unit
  Interviewer         3.59   |    5.33    |   4.19  |  5.46**
  Interviewee         7.66   |    6.46    |   7.68  |  4.73
References b
  Self               64.47   |   63.24    |  65.26  |   .07
  Parent              1.52   |    2.99    |   3.38  |  1.91
  Sibling             1.32   |    1.74    |   2.58  |   .61
  Male peer           4.35   |    2.60    |   4.64  |  1.57
  Female peer         5.22   |    5.16    |   4.15  |   .31


a Expressed in seconds.
b Expressed in percentage of units.
* p < .05.
** p < .01.


Interviews were observed from an adjacent room 
by means of a one-way-vision mirror. The follow- 
ing instructions were read to both Ss: 


The purpose of this experiment is to determine 
how well people can get to know a relative stranger 
in a short time. You [designating the interviewer] 
are the interviewer. It will be your job to find out 
as much as possible about the person you will be 
interviewing. You will have 15 minutes to do this. 
You may ask any questions or discuss any topics 
you wish aimed at getting to know the other 
person as well as possible in this short period of 
time. To preserve anonymity, the other person’s
name need not be mentioned in the interview. The 
interview will be taped, and observed through a 
one-way-vision mirror. However, no one other
than myself will be able to observe or listen to 
the interview as it is progressing. When the 15 
minutes are up, you will be notified. Are there any 
questions? 


[To interviewee] You are the person to be inter- 
viewed. Your task is simply to answer these ques- 
tions as honestly and completely as possible. Al- 
though the interviewer is not limited as to type of 
questions or topics, you may decline to answer if 
you feel the question would require too personal
an answer. Are there any questions? 


The interview was scored by an observer on a 
device2 similar to the Interaction Chronograph as
described by Matarazzo, Saslow, and Matarazzo 
(1956). The device consisted of two Hunter Kloc- 
kounters, each wired to a four-digit counter and 
activated by a telegraph key which was pressed for 
the duration of utterance of either interviewer or 
interviewee. In this way, total time spent speaking 
and number of speech units for interviewer and 
interviewee were obtained separately. 


Analysis 


Eleven interview parameters were computed and 
analyzed within a simple randomized analysis-of- 
variance design. The parameters were as follows: 
(1) number of interviewer units, (2) number of 
interviewee units, (3) total time of interviewer 
speech, (4) total time of interviewee speech, (5) 
average time per interviewer unit, (6) average time 
per interviewee unit, (7) percentage of interviewee 
units with reference to self, (8) percentage of inter- 
viewee units with reference to parents, (9) percent- 
age of interviewee units with reference to siblings, 
(10) percentage of interviewee units with reference 
to male peers, (11) percentage of interviewee units 
with reference to female peers. 

Due to mechanical failures in the equipment, 3 
interviews in each interviewer group were elimi- 
nated; hence, the final analysis included 17 inter- 
views in each group. 
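
For readers unfamiliar with the simple randomized design, the F ratio for any one of the eleven parameters can be sketched as follows. This is a minimal illustration, not the author's computation; the function name `one_way_f` and the sample values are hypothetical. With three interviewer groups of 17 interviews each, the degrees of freedom come to 2 and 48.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    `groups` is a list of lists, one list of scores per interviewer group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three small groups, used only to exercise the function.
f_ratio, df_b, df_w = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```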


RESULTS AND DISCUSSION 


Interview interaction scores for the repres- 
sor, sensitizer, and neutral interviewer groups 
and their respective F ratios are presented in 
Table 1. 

It is clear that the three groups of inter- 
viewers differed with regard to how active a 
role they took in the interview, as measured 
by the time they spent talking. The sensitiz- 
ers seem to have spent far more time verbal- 
izing than their repressor and neutral col- 
leagues, and the effect of this was to diminish 
the verbal output of their interviewees. While 
the personality of the interviewer appears to 
be related to his quantitative output and that 
of the person he is interacting with, it did not 
significantly affect the content of the utter- 
ances of the interviewee, at least with regard 
to the utterances within each of the five content
areas investigated here. The three treatment
groups did not differ in relative proportion
of references to the self, parents, siblings,
or male and female peers. Similar results in a
somewhat different setting are reported by
Sarason and Winkel (1966), who found in a
self-interview situation that the hostility level
of the experimenters made no difference in
subjects’ total units, references to peers, or
references to self. Hostility was measured by
a self-report scale which measures the tendency
to say negative things about oneself.


2 The author wishes to express his appreciation to
J. Brown Grier for his assistance in the construction
of the device.


TABLE 2


COMPARISON OF MEAN DURATION OF UTTERANCES
REPORTED BY MATARAZZO, WEITMAN, SASLOW, AND
WIENS (1963) AND BY THE PRESENT INVESTIGATION


     Matarazzo et al. a             Kaplan b

Interviewer | Interviewee | Interviewer | Interviewee

    5.3     |    24.3*    |     3.6     |   7.7** c
    9.9     |    46.9*    |     5.3     |   6.4** d
    6.1     |    26.6*    |     4.2     |   7.7** e


a One interviewer; three 15-minute periods each.
b 15-minute period each.
c Repressor.
d Sensitizer.
e Neutral.
* p < .05.
** p < .001.


The results support the findings reported 
by Matarazzo, Weitman, Saslow, and Wiens 
(1963) that length of interviewer utterance 
is a determinant of length of interviewee ut- 
terance. It is interesting to note, however, 
that the present study found the opposite ef- 
fect of that reported by Matarazzo et al. 
(1963). The present study found a comple- 
mentary effect of interviewer speech duration 
upon interviewee speech duration, that is, the 
longer the group mean duration of interviewer 
speech, the shorter the mean utterance 
elicited from each group’s respective inter- 
viewees. Matarazzo et al. (1963) found 
longer interviewer speech durations to have 
the opposite effect upon interviewee utter- 
ances, a parallel effect of one upon the other. 
The differences between the two studies are 
noteworthy. While Matarazzo et al. used the 
same professional interviewer, the present
study used three groups of student interviewers.
Further, the length of interviewer
speech unit in the Matarazzo study was manipulated
by experimental instructions, and
the interviewer was instructed to be nondirective.
No such restraints were placed upon
present interviewers, and interviewer speech 
duration was self-regulated. When one com- 
pares the average speech duration of Mata- 
razzo’s interviewees to those obtained in the 
present study (Table 2), one may conjecture 
that the utterances of Matarazzo’s interviewer 
were more nondirective and open-ended than 
those of the present interviewers and, thus, 
permitted longer responses. Hence, it may be 
that complementarity of interviewer and 
interviewee speech duration is to be found 
in relatively directive interviews and a paral- 
lel effect found in nondirective interviews. 

Research findings reported here suggest 
that the personality of the interviewer can 
affect the verbal behavior of the interviewee 
through the medium of the interviewer’s 
speech patterns.
Earlier research has suggested that the commonly used operational definitions
of overinclusive thinking do not measure the same variable, since studies 
using Epstein’s inclusion test and Payne’s overinclusion battery have produced 
divergent results. To study this possibility, Epstein scores, 2 measures used by 
Payne as definitions of overinclusion (number of words used to interpret 
Benjamin’s proverbs and number of objects “handed over” during 4 trials on 
the Goldstein object-sorting test), and 2 measures of visual-spatial stimulus 
generalization were collected on 100 male schizophrenics. With the exceptions 
of the built-in correlations between the stimulus-generalization measures and 
between Epstein’s over- and underinclusion scores, only 2 of the 15 intercorrelations
were significant at the .05 level, and both of these in the direction
opposite to that hypothesized. The ramifications of these findings for
overinclusion research are discussed.


In recent years a considerable amount of 
interest has been generated in a schizophrenic 
thinking disorder called “overinclusion.” 
Cameron (1939, 1944) originated the term to 
describe a “remarkable inability to maintain 
the boundaries of a problem and restrict 
[one’s] operation within its limits.” As ex- 
amples of this phenomenon, Cameron cited 
the tendency of schizophrenics to include with 
test stimuli in an object-sorting task such 
irrelevant items as the experimenter’s clothing 
and watch, the desk blotter, the walls of the 
room, and even persons passing by outside 
the window. On a clinical level, overinclusion 
might also be conceptualized as including 
tangentiality and delusional thinking. 

Subsequently, an array of overinclusion test 
definitions appeared. Epstein (1953) opera- 
tionally defined this variable by a 50-item 
paper-and-pencil inclusion test. Each item
consisted of a key word (e.g., man) and five 
foils. The subject was instructed to underline 
those foils which were necessary parts (e.g., 
arms, toes, head) of the concept described by 
the key word. The underlining of unnecessary 
choices (shoes, hat) was defined as overinclu- 
sion, while failure to mark the required terms 
was called “underinclusion.” Epstein’s test
has been widely used by researchers (Desai,
1960, who described the research of Ley;
Eliseo, 1963; Epstein, 1953; Payne & Hewlett,
1960; Payne & Hirst, 1957; Payne,
Matussek, & George, 1959; Rablen, 1958;
Watson, in press; Whittier, Klein, Levine, &
Weiss, 1960). In addition, a revision of this
test has recently been developed by Sturm
(1963).

1 This study was supported by the Veterans Administration
Psychiatric Evaluation Project, Lee
Gurel, director. The contributions of William G.
Klett, Kathleen Schelonka, Catherine Tidd, and Bill
Skay are gratefully acknowledged.

Another set of definitions has been given 
the term “overinclusion” by Payne and his 
associates. Payne and Friedlander (1962) 
have defined the term as a composite score 
based upon (a) the number of words used 
in explaining Benjamin’s proverbs (1944), 
(b) the number of incorrect sortings produced
on the Weigl-Goldstein-Scheerer Color Form 
Sorting Test, and (c) the number of objects 
included in four trials on the Goldstein- 
Scheerer Object Sorting Test. The rationale 
for the “proverbs” definition was that the 
overinclusive individual, unable to exclude 
unnecessary associations from his answers, 
would produce wordier definitions than the 
normal. The use of number of incorrect color-form
sortings was justified on the grounds
that overinclusive individuals, using both rele- 
vant and nonrelevant cues, should be more 
prone to make incorrect sortings than nor- 
mals. Number of Goldstein object-sorting 
stimuli included was employed on the as- 
sumption that this, too, would reflect in- 
ability to exclude unnecessary stimuli from 
a concept. The Payne definitions of overinclusion
have also been used extensively in
research work on schizophrenic thinking 
(Payne, Caird, & Laverty, 1964; Payne & 
Friedlander, 1962; Payne, Friedlander, Lav- 
erty, & Haden, 1963; Payne & Hewlett, 1960; 
Payne, et al., 1959). 

It is disturbing to note that a number of 
papers suggested that the Payne and Epstein 
definitions of overinclusion are not measuring 
the same variable. In a variety of studies, 
Epstein’s overinclusion has been shown to 
be unrelated to the process-reactive dimension 
(Eliseo, 1963); length of hospitalization 
(Watson, in press); and, probably, the pres- 
ence of delusions, to the extent that they are 
reflected by the F, Paranoia, and Schizo- 
phrenia scales of the MMPI (Watson, in 
press) and to be negatively correlated with 
IQ (Watson, in press). Overinclusion has 
also been reported with some frequency 
among emotionally disturbed nonschizophrenic
groups, including psychotic depressives
(Payne & Hirst, 1957), neurotics and charac- 
ter disorders (Desai, 1960; Payne & Hew- 
lett, 1960; Payne, et al., 1959), and medical- 
surgical patients (Eliseo, 1963). 

In marked contrast to these findings, 
Payne’s definitions are apparently unaffected 
by IQ and are found only in schizophrenics 
(Payne & Hewlett, 1960). Furthermore, his 
overinclusion measures are more prominent 
among acute (reactive?) and delusional 
schizophrenics (Payne, 1962; Payne, et al., 
1964) than among their chronic (process?) 
and undeluded counterparts. These highly 
conflicting descriptions of the correlates of 
overinclusion strongly suggest that the Payne 
and Epstein definitions do not measure the 
same variable, even though both go by the 
same title. 

It is also interesting to note the similarity
between Cameron’s original concept of over- 
inclusion and the notion of stimulus generali- 
zation (SG). It seems likely that laboratory 
subjects responding to items resembling a 
standard stimulus are manifesting the same 
“remarkable inability to maintain the 
boundaries of a problem and restrict [one’s] 
operations within its limits [Cameron, 
1944].” Cameron dubbed this overinclusion. 
Indeed, several studies have demonstrated 
the relationship of SG to anxiety (M. T.
Mednick, 1957; S. A. Mednick, 1957; 
Rosenbaum, 1953), hysteria (Eriksen, 1954), 
and schizophrenia (Garmezy, 1952; Knopf & 
Fager, 1959). These results bear some re- 
semblance to those dealing with the Payne 
and Epstein overinclusion measures and fur- 
ther suggest the possibility that SG may be 
related to at least some of the several over- 
inclusion definitions. 

The purpose of the present study is to 
help clarify the confusion revolving around 
overinclusion by determining the relationships 
between several definitions of overinclusion. 


PROCEDURE 


Subjects. The Ss were 100 male schizophrenics 
under the age of 45 without evidence of organic 
brain damage. All were randomly selected from a 
pool of eligibles at the Veterans Administration 
Hospital, St. Cloud, Minnesota. The age, education, 
length of hospitalization, and Revised Beta IQ 
means of the sample were, respectively, 37.96 years 
(SD = 3.56), 11.16 years (SD = 2.56), 100.97 months 
(SD = 66.21), and 89.49 (SD = 15.15). 

Overinclusion measures. Six previously used or 
potential overinclusion definitions based on four tests 
were employed in the present study. They were 
overinclusion and underinclusion scores from Ep- 
stein’s test, number of words used to interpret Benja- 
min’s proverbs, number of objects handed over in 
four trials on Goldstein’s object-sorting test, and 
two stimulus-generalization measures. 

A visual-spatial SG apparatus was fabricated for 
use in this project. The device was patterned after 
that developed by Brown, Bilodeau, and Baron 
(1951). It consisted of a horizontal row of seven 
white 7-watt lights mounted on a black curved 
panel 72 inches long and 30 inches high mounted 
on a table. The lights were placed at eye level and 
were uniformly spaced at 8-inch intervals. Due to
the curvature of the board, all lights were equidistant 
(5 feet) from the S’s eyes. A red 7-watt bulb located 
8 inches above the center white light served as a 
ready signal. A reaction-time device was activated 
by a switch mounted in a plastic case and held 
by the S in such a manner that the button could 
be depressed by his thumb. Seated behind the board, 
the examiner could turn on any of the seven lights 
following the illumination of the red ready signal. 
Frequency of response to all lights and latency to 
the nearest .01 second were recorded. Reaction time 
was measured by an electric timer, but was not 
considered in the present study. 

The apparatus was situated in a basement room. 
The windows were covered with opaque shades 
which were taped to the window frame to minimize 
the amount of light in the testing room. Two 
shaded 15-watt fluorescent lights provided illumination
for the examiner and provided some light for
the S as well.


TABLE 1
INTERCORRELATIONS BETWEEN OVERINCLUSION MEASURES

Measure                                  2        3        4        5        6

1. Epstein overinclusion              -.241*   -.081    -.025    -.058    -.001
2. Epstein underinclusion                      -.066     .090     .257*   [illegible]
3. Proverbs word count                                   .007    -.012    -.027
4. Number of object-sorting items                                 .157     .144
5. SG errors                                                               .967**
6. SG weighted score


Each S was read a set of instructions describing 
the task as a reaction-time test. He was encouraged 
to react as quickly as possible to the lighting of 
the central lamp by pressing the button with his 
thumb. He was also instructed not to respond to the 
illumination of other lamps. The Ss were given an 
initial series of 35 trials with the center lamp. Test 
trials followed the initial series without interruption 
or notification of the S. During the test series, each 
of the six peripheral lights was presented four times. 
The 24 peripheral-light trials were interspersed 
among 95 presentations of the central lamp. From 
three to seven central-light trials intervened between 
successive presentations of the peripheral stimuli. 
This procedure was accomplished at the rate of 
approximately five or six trials per minute. The 
period between the activation of the ready light and 
the white stimulus lights varied between 3 and 5 
seconds.

Two measures of SG were employed. The simplest 
was a count of the number of responses made by 
each S to the six peripheral lamps. A second measure 
was developed to provide a score weighting the 
peripheral lights as a direct function of their distance 
from the central standard lamp. Accordingly, each 
peripheral light was assigned a numerical value: 1
to the two lamps adjacent to the central stimulus,
2 to the intermediate lamps, and 3 to the most
distal lamps.
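
The two SG measures can be sketched in a few lines. The sketch assumes that the weighted score is the sum of the lamp weights over all peripheral-light trials on which the S responded; the text does not state the combining rule explicitly. The position coding (-3 to 3, with 0 for the central lamp) and the function name `sg_scores` are hypothetical.

```python
def sg_scores(responded_positions):
    """Return (error_count, weighted_score) for one S.

    `responded_positions` lists the lamp position for every trial on which
    the S pressed the button. Positions run from -3 to 3, with 0 denoting
    the central standard lamp (an assumed coding).
    """
    errors = 0
    weighted = 0
    for position in responded_positions:
        if position == 0:
            continue  # a response to the central lamp is correct, not an SG error
        errors += 1
        # Weight 1 for adjacent, 2 for intermediate, 3 for the most distal lamps.
        weighted += abs(position)
    return errors, weighted


# Hypothetical S who responded to four peripheral lamps and once to the center.
example = sg_scores([-1, 1, 2, -3, 0])
```

Because both scores are built from the same responses, a high correlation between them is expected by construction.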


RESULTS AND DISCUSSION 


Intercorrelations of the several overinclu- 
sion measures are presented in Table 1. It 
will be seen that, for the most part, these 
relationships are strikingly low. In fact, if one 
ignores the built-in correlations between the 
two stimulus-generalization measures and be- 
tween the two Epstein scores, only 2 of the 
remaining 13 r’s are significant at the .05
level, and neither exceeds .30. These r’s appear
between Epstein’s underinclusion and the
two SG measures. Rather than being negative,
as had been anticipated, both were positive;
this suggests that SG, rather than being related
to overinclusion, may be weakly associated
with poverty or concreteness of concept
formation.
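
The counts and significance judgments in this paragraph can be checked with a short calculation. The sketch assumes a two-tailed test at the .05 level with N - 2 = 98 degrees of freedom and uses an approximate tabled critical t of 1.984; the function names are hypothetical.

```python
from math import comb, sqrt

N = 100            # number of Ss
T_CRIT_05 = 1.984  # approximate two-tailed .05 critical t for 98 df (assumed tabled value)


def t_from_r(r, n):
    """Convert a product-moment r to a t statistic with n - 2 df."""
    return r * sqrt((n - 2) / (1 - r * r))


def critical_r(t_crit, n):
    """Smallest absolute r that reaches the given critical t with n - 2 df."""
    return t_crit / sqrt(t_crit ** 2 + (n - 2))


total_pairs = comb(6, 2)           # 15 intercorrelations among six measures
remaining_pairs = total_pairs - 2  # 13 after setting aside the two built-in r's
threshold = critical_r(T_CRIT_05, N)

# Values taken from Table 1: .257 reaches the threshold, .157 does not.
is_significant = {r: abs(t_from_r(r, N)) > T_CRIT_05 for r in (-0.241, 0.257, 0.157)}
```

Under these assumptions an r must reach roughly .20 in absolute value to be significant, which is why the object-sorting correlations of .157 and .144 fall short.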

It is also interesting to note that the 10 
r’s relating measures previously used or proposed
as overinclusion definitions (Epstein
overinclusion, proverbs word count, object- 
sorting score, and the two SG variables) were 
all nonsignificant. Unless the reliabilities of 
these measures are extremely low, they obvi- 
ously reflect quite different variables and ap- 
pear to explain the existence of a highly 
contradictory set of findings on the correlates 
of overinclusion. 

The present results can be interpreted as 
unfavorable validity data for the several mea- 
sures as a whole. Apparently, further research 
will be needed to establish which, if any, of 
these measures actually reflect the clinical 
phenomenon of overinclusion. However, as is 
the case in any such validational investiga- 
tions, a second hypothesis—that clinical over- 
inclusion is, in fact, multidimensional—should 
also be considered in view of the present 
findings. 

The results also throw doubt on the ra- 
tionale for Payne’s practice of summing con- 
verted scores based on a proverbs-test word 
count and number of object-sorting items 
selected as a partial definition of overinclusion 
(Payne & Friedlander, 1962; Payne et al., 
1963). Since the r’s between these variables
are very low (.007 in the present paper and 
—.030 in a population studied by Hawks, 
1964), it would seem that the practice is 
without empirical justification. 
MULTIVARIATE ANALYSES OF PARENTS’ MMPIs BASED ON
THE PSYCHIATRIC DIAGNOSES OF THEIR CHILDREN 1


WILLIAM D. WOLKING, GEORGE H. DUNTEMAN, AND JOHN P. BAILEY, JR.
Health Center, University of Florida 


The specific goal of the study is to determine if the mean MMPI profiles of 
the parents of 6 child diagnostic groups are significantly different from each 
other. The parents were grouped according to the diagnosis of their children. 
Separate analyses were made for the mothers of male children, mothers of 
female children, fathers of male children, and fathers of female children. An
overall multivariate test of the differences in the parents’ mean MMPI profiles 
for each of these samples indicated significant differences in parent profiles 
only for the mothers of male children. However, the significant difference for 
this analysis was not of much practical significance.


Many attempts have been made to relate 
home atmosphere and parental attitudes and 
behavior to the personality and behavior of 
children (Baldwin, Kalhorn, & Breese, 1945; 
Becker, Peterson, Hellmer, Shoemaker, & 
Quay, 1959; Burchinal, 1958; Schaefer & 
Bayley, 1963; Shoben, 1949; Slater, 1962). 
Clinicians have been especially interested in 
establishing useful predictive relationships 
between the behavior and feelings of deviant 
children and the attitudes, traits, and behaviors
of their parents. As the research relating
parental attitudes to child behavior proved
somewhat disappointing, there seemed to develop
the hope that higher and more stable
relationships would be found between parental
personality traits and child personality and
behavior. The MMPI has frequently been 
used to measure the parental traits, and it has 
been established that the parents of child 
clinic patients are reliably more deviant on 
several MMPI scales than parents in general 
(Goodstein & Rowley, 1961; Hanvik & Byrum,
1959; Lauterbach, London, & Bryan,
1961; Liverant, 1959; Marks, 1961; Wolking,
Quast, & Lawton, 1966), although these
differences are typically small and of dubious
psychological meaning. Attempts to relate
parent MMPI profiles and child symptoms
and behavior patterns have provided a few


1 The participation of the second and third authors
was supported by Grant RD-1127 to the Regional
Rehabilitation Research Institute, University of Florida,
from the Vocational Rehabilitation Administration,
Department of Health, Education, and Welfare,
Washington, D.C.


leads, but have not generally been encourag- 
ing (Hanvik & Byrum, 1959; Wolking, et 
al., 1966). 

The present study was conceived in the 
hope that the use of a relatively large sample, 
many carefully diagnosed criterion groups of 
children, and multivariate analysis of vari- 
ance might bring out findings that had been 
latent in previous research. The specific goal 
of the study is to determine if the mean 
MMPI profiles of the parents of several child 
diagnostic groups are significantly different 
from each other. 


METHOD 


The MMPI profiles were obtained from the 
mothers and fathers of patients in the Division of 
Child Psychiatry of the University of Minnesota 
Medical Center. Considered in this study were six
psychiatric diagnostic subgroups of male children and 
five psychiatric diagnostic subgroups of female chil- 
dren. All of the children’s psychiatric diagnoses were 
made by the same psychiatrist using an explicit and 
consistent set of diagnostic rules. The K-corrected
MMPI profiles (10 basic clinical scales plus the 
L, F, and K scales) were obtained for the mother 
and father of each child. The mothers’ and fathers’ 
profiles were then divided into separate groups on 
the basis of the psychiatric diagnosis of their male 
or female child. To control for the influence of the 
sex of both parent and child, the total sample was 
divided as follows: MMPI profiles of the mothers 
of male children (n =312), fathers of male children 
(n =233), mothers of female children (n= 143), 
and fathers of female children (n = 105). The MMPI 
profiles for the mothers and fathers were each 
divided into six subgroups on the basis of the fol- 
lowing diagnostic groups of male children: organic 
brain syndrome (m = 41, mother and n = 36, father), 
psychosis (n= 19 and 17), conversion reaction (n = 
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Fic. 1. Mean MMPI profiles for the mothers of female children in five diagnostic groups. 


29 and 20), anxiety reaction (n= 76 and 57), be- 
havior disorder (n=79 and 54), and mental de- 
ficiency (n= 68 and 49). Likewise, the MMPI pro- 
files of the mothers and fathers were each divided 
into five subgroups on the basis of the following 
diagnostic groups of female children: psychosis 
(n=12 and 12), conversion reaction (n=39 and 
27), anxiety reaction (n= 30 and 23), behavior dis- 
order (n= 27 and 17), and mental deficiency (n= 35 
and 26). 

An overall multivariate test of the differences in 
the parents’ mean MMPI profiles for the different 
children’s diagnostic subgroups for each of the four 
samples was computed. The test was based on an F 
approximation for testing the significance of Wilks’ 
lambda (Cooley & Lohnes, 1962). 
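
As an illustration of the kind of test described above, the following is a minimal sketch of Wilks' lambda with Rao's F approximation. It assumes that Rao's approximation is the one intended; the text does not say which F approximation was taken from Cooley and Lohnes. The function name and the toy data are hypothetical and are not the study's data.

```python
import numpy as np


def wilks_lambda_rao_f(groups):
    """Wilks' lambda and Rao's F approximation (assumed variant).

    groups: list of (n_i, p) arrays, one array of profiles per group.
    Returns (lambda, F, df1, df2).
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_rows = np.vstack(groups)
    n_total, p = all_rows.shape
    q = len(groups) - 1  # hypothesis degrees of freedom (groups - 1)

    # Within-group and total sums-of-squares-and-cross-products matrices.
    within = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    centered = all_rows - all_rows.mean(axis=0)
    total = centered.T @ centered

    lam = np.linalg.det(within) / np.linalg.det(total)

    # Rao's approximation; s is set to 1 when its formula is undefined.
    m = n_total - 1 - (p + q + 1) / 2
    den = p**2 + q**2 - 5
    s = np.sqrt((p**2 * q**2 - 4) / den) if den > 0 else 1.0
    df1 = p * q
    df2 = m * s - p * q / 2 + 1
    root = lam ** (1 / s)
    f_stat = (1 - root) / root * df2 / df1
    return lam, f_stat, df1, df2


if __name__ == "__main__":
    # Toy example: two groups, one variable (reduces to a one-way ANOVA F).
    lam, f_stat, df1, df2 = wilks_lambda_rao_f([[[1], [2], [3]], [[4], [5], [6]]])
    print(lam, f_stat, df1, df2)
```

With one variable and two groups the statistic reduces to the ordinary one-way analysis of variance F, which gives a simple check on the sketch.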


RESULTS AND DISCUSSION 


The overall multivariate hypothesis of no differences 
in mean MMPI profiles among the 
parents of different diagnostic groups was accepted 
at the .05 level of significance for the 
mothers of female children, fathers of female 
children, and fathers of male children and was 
rejected at the .025 level for the mothers of 
male children. The mean MMPI profiles for 
each of the four analyses are presented in 
Figures 1 through 4. Also indicated in each 
figure are the significance levels for the F tests 
of the individual scales. 

The significant Fs for the second and third 
analyses are presented for the reader’s in- 
formation but will not be discussed because 
the overall profiles were not significantly 
different from each other. 

The highly significant F for the Si scale 
on the last analysis indicated that this scale 
contributed the most to overall group separa- 
tion. This was verified by the size of the Si 
scale’s coefficients on the two largest discrimi- 
nant functions. 

The practical significance of these profile 
differences for the last analysis was assessed 


Fig. 2. Mean MMPI profiles for the fathers of female children in five diagnostic groups. 

Fig. 3. Mean MMPI profiles for the fathers of male children in six diagnostic groups. 


by examining how these same mothers could 
be classified into the six respective diagnostic 
categories on the basis of their discriminant 
scores on the five discriminant functions. The 
classification matrix is presented in Table 1. 
The rows indicate actual diagnostic categories, 
while the columns indicate the predicted cate- 
gories. With the possible exception of the 
second and third categories, the table indicated 
that the group differences are of little prac- 
tical significance. Moreover, we would expect 
a drop in the number of correct classifications 
for the second and third categories on a cross- 
validation sample. 
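
The classification results in Table 1 can be summarized as hit rates. The sketch below is illustrative only: the function name is hypothetical, and the ANX-row entry under CVN is illegible in the scanned table, so it is taken as 7 on the assumption that the row sums to the reported n of 76. The CVN row as printed sums to 28 rather than the reported 29 and is used as printed.

```python
LABELS = ["OBS", "PSY", "CVN", "ANX", "BD", "MD"]

# Rows are actual groups, columns are predicted groups (Table 1).
TABLE_1 = [
    [13, 0, 4, 5, 11, 8],
    [1, 9, 1, 1, 5, 2],
    [2, 0, 16, 3, 4, 3],
    [8, 5, 7, 21, 17, 18],  # third entry (7) is an assumption, see lead-in
    [8, 9, 10, 9, 30, 13],
    [3, 6, 12, 14, 19, 14],
]


def hit_rates(matrix):
    """Return (per-group proportion correct, overall proportion correct)."""
    per_group = [row[i] / sum(row) for i, row in enumerate(matrix)]
    correct = sum(row[i] for i, row in enumerate(matrix))
    overall = correct / sum(sum(row) for row in matrix)
    return per_group, overall


if __name__ == "__main__":
    per_group, overall = hit_rates(TABLE_1)
    for label, rate in zip(LABELS, per_group):
        print(f"{label}: {rate:.2f}")
    print(f"overall: {overall:.2f}")
```

Under these assumptions roughly one third of the mothers are placed in the correct category, with the psychosis and conversion groups classified best.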

These results, with the MMPI findings 
cited in the first paragraph, indicate that al- 
though significant MMPI profile differences 
sometimes exist among various parent groups, 
the magnitude of the differences is too small 


TABLE 1 

CLASSIFICATION OF MOTHERS OF MALES IN THE SIX 
DIAGNOSTIC CATEGORIES 

                    Predicted group 
Actual 
group     OBS   PSY   CVN   ANX    BD    MD 
OBS        13     0     4     5    11     8 
PSY         1     9     1     1     5     2 
CVN         2     0    16     3     4     3 
ANX         8     5     7    21    17    18 
BD          8     9    10     9    30    13 
MD          3     6    12    14    19    14 

Note. Abbreviated: OBS = organic brain syndrome, PSY 
= psychosis, CVN = conversion neurosis, ANX = anxiety 
reaction, BD = behavior disorder, and MD = mental deficiency. 


to be of help to the clinician in distinguishing 
between parents of normal children and clinic-patient 
children or among the parents of several 
diagnostic groups of clinic children. The 
MMPI may not be measuring the relevant 
aspects of child-parent relationships or behavior. 

Fig. 4. Mean MMPI profiles for the mothers of male children in six diagnostic groups. 


REFERENCES 


Baldwin, A. L., Kalhorn, J., & Breese, F. H. Patterns 
of parent behavior. Psychological Monographs, 
1945, 58(3, Whole No. 268). 

Becker, W. C., Peterson, D. R., Hellmer, L. A., 
Shoemaker, D. J., & Quay, H. C. Factors in 
parental behavior and personality as related to 
problem behavior in children. Journal of Consulting 
Psychology, 1959, 23, 107-118. 

Burchinal, L. G. Parents’ attitudes and the adjustment 
of children. Journal of Genetic Psychology, 
1958, 92, 69-79. 

Cooley, W. W., & Lohnes, P. R. Multivariate procedures 
for the behavioral sciences. New York: 
Wiley, 1962. 

Goodstein, L. D., & Rowley, V. N. A further study 
of MMPI differences between the parents of disturbed 
and nondisturbed children. University of 
Iowa, 1961. (Mimeo) 

Hanvik, L. J., & Byrum, M. MMPI profiles of parents 
of child psychiatric patients. Journal of Clinical 
Psychology, 1959, 15, 427-431. 


W. D. WOLKING, G. H. DUNTEMAN, AND J. P. BAILEY, JR. 


Lauterbach, C., London, P., & Bryan, J. MMPI’s of 
parents of child guidance cases. Journal of Clinical 
Psychology, 1961, 17, 151-154. 

Liverant, S. MMPI differences between parents of 
disturbed and nondisturbed children. Journal of 
Consulting Psychology, 1959, 23, 256-260. 

Marks, P. A. An assessment of the diagnostic process 
in a child guidance setting. Psychological Monographs, 
1961, 75(3, Whole No. 507). 

Schaefer, E. S., & Bayley, N. Maternal behavior, 
child behavior, and their intercorrelations from 
infancy through adolescence. Monographs of the 
Society for Research in Child Development, 1963, 
28(3, Whole No. 87). 

Shoben, E. J. The assessment of parental attitude in 
relation to child adjustment. Genetic Psychology 
Monographs, 1949, 39, 101-148. 

Slater, P. E. Parental behavior and the personality 
of the child. Journal of Genetic Psychology, 1962, 
101, 53-68. 

Wolking, W. D., Quast, W., & Lawton, J. J. 
MMPI profiles of the parents of behaviorally disturbed 
children and parents from the general population. 
Journal of Clinical Psychology, 1966, 22, 
39-48. 


(Received December 22, 1966) 


Journal of Consulting Psychology 
1967, Vol. 31, No. 5, 525-528 


“OPENNESS” AS A DIMENSION OF PROJECTIVE 
TEST RESPONSES 


J. HERBERT HAMSHER AND AMERIGO FARINA 


University of Connecticut 


This study attempts to assess the degree to which conscious motivation affects 
Ss’ “openness” on projective tests. Also examined is the influence of stimulus 
variation. Using role-playing instructions, 31 undergraduate Ss were directed 
to tell personally revealing stories to 6 TAT cards; 29 additional Ss were 
told to write “guarded” stories. Ratings of the stories by judges using a 
manual of openness indicated that Ss can successfully control this facet of TAT 
performance and that judges can objectively evaluate an S’s intent. Degree of 
assessed openness is shown to be related to differences in TAT cards and also 
to the sex of the respondent. Implications for use of projective techniques and 
for further research are discussed, as well as the significance of a subgroup 
of Ss who, while instructed to be guarded, wrote stories rated as open. 


Despite the extensive use of inferences 
drawn from projective protocols, it is obvious 
that many basic questions concerning test 
productions have yet to be answered. One of 
the areas of ambiguity revolves around the 
extent of control subjects have over what and 
how much they reveal. For example, Frank 
(1939), in his classic article, conceptualized 
projective tests as “obtaining from the sub- 
ject what he cannot or will not say [p. 403].” 
Frank presumed that such revelation required 
neither awareness nor conscious motivation. 
More recently, however, Murstein (1965) has 
observed, “The normal individual has proved 
in subsequent research unusually able to pro- 
tect his ‘private world’ from manifesting itself 
on projective techniques [p. 1].” Masling 
(1960) and Murstein (1963) have reviewed 
the experimental evidence in support of this 
more realistic view. 

The present investigation is an attempt to 
extend our understanding of the degree and 
nature of subject control over how much personal 
material he communicates through Thematic 
Apperception Test (TAT) protocols. 
Specifically, it focuses on the conscious con- 
trol associated with motivation to reveal or 
to conceal and, further, explores the role of 
stimulus variation and individual differences in 
this control. A related purpose of this study 
is also the development of an objective scale 
for the assessment of the degree to which a 
protocol indicates self-revelation or self-concealment. 


METHOD 
Subjects 


A total of 61 Ss enrolled in an undergraduate 
course in personality underwent the experimental 
procedure during their regular class meeting. The Ss 
were in two sections which were taught by the same 
instructor and which differed only in that one met 
in the morning and one met in the afternoon. As- 
signment of one group to the “open” condition and 
the other to the “guarded” condition (described 
below) was on a chance basis. Class standing ranged 
from sophomore to senior, and age varied from 19 
to 37; the two groups were essentially equivalent in 
these respects. Fifteen females and 14 males constituted 
the open group, and 17 females and 15 males 
constituted the guarded group. 


Cards 


The projective test administered was an abbrevi- 
ated TAT. From the standard TAT series, Cards 1, 
2, 4, 3BM, 13MF, and 16 were selected as repre- 
sentative and as depicting a variety of situations 
potentially related to the concerns, problems, and 
anxieties of a college population. These cards also 
provide a wide range of stimulus structure; they 
vary from Card 1, which unambiguously depicts a 
young boy staring at a violin, to Card 16, which is 
completely blank. Black and white transparencies 
were made from the cards and projected in the 
semidarkened classroom. The screen image was ap- 
proximately a 5-foot square and could easily be seen 
by all Ss. 


Administration and Instructions 


The test was administered to both classes on the 
same day; in each class the senior author was in- 
troduced by the instructor as a research psychologist 
working with the NIMH. The research was ex- 
plained to each group as an investigation into the 
use of stories told to pictures in uncovering anxie- 
ties and conflicts. The open group was requested to 
role play as students applying to a psychological 
clinic for assistance and to assume that they were 
attempting to reveal as much as possible about themselves 
through their stories. It was explained to the 
guarded group that people with problems sometimes 
want help but still find it difficult to reveal them; 
they were asked to assume this role. Both groups 
were assured that although their problems might 
not be of sufficient magnitude actually to bring them 
to a psychology clinic, the task would be more 
realistic if they would concentrate on revealing, or 
guarding against revealing, their own personal conflicts 
and anxieties. They were then given standard 
TAT instructions. 

Each slide was introduced by E as follows: “Remember, 
you are concentrating on your own problems 
and trying to [not to] communicate your concern 
through your stories.” Before Card 16, the 
following additional instruction was given: “The 
next one is somewhat different; it is blank and you 
are first to imagine a scene and then tell a story 
about it, just as you have with the others.” 


TABLE 1 
SUMMARY OF ANALYSIS OF VARIANCE: 
STORY RATINGS 

Source                    df       MS        F 
Groups                     1     183.93    44.97** 
Ss in groups              56       4.09 
Cards                      5       1.73     2.79* 
Cards X Groups             5       1.72     2.77 
Ss in Groups X Cards     280        .62 
Total                    347 

* p < .05. 
** p < .01. 


TABLE 2 
MEANS AND STANDARD DEVIATIONS OF OPENNESS 
RATINGS OF STORIES AND PROTOCOLS 

                            Card 
Group        1      2      4     3BM   13MF    16    Protocol 
Open 
  M        3.79   3.62   3.73   4.14   3.76   4.55     4.0 
  SD       1.01   1.07   1.02    .96   1.22    .83    1.05 
Guarded 
  M        2.55   2.55   2.41   2.41   2.55   2.50    2.41 
  SD       1.14    .97    .97   1.28   1.20   1.18    1.21 


Rating Degree of Openness 


Openness was used in a way consistent with its 
generally understood connotation of free, unrestricted 
communication of personal information; it 
was operationally defined by a rating from 1 through 
5, based on a manual developed in pilot work.1 The 
following summarizes the story characteristics associated 
with the five scale points: 

1. Rejection or mere description of the card; 
stories stilted, stereotyped, unelaborated to marked 
degree. 

2. Above features to more limited degree; infrequent 
suggestions of a desire to communicate. 

3. Equal proportions of indicators of openness 
and guardedness. 

4. Apparent identification with story characters 
and personalization of theme. Stories freely elaborated, 
with some involvement with plot development. 
Some indications of reservation, holding back. 

5. Stories clearly personalized and fully elaborated; 
lack previous defensive signs. 


1 The manual used in this study is currently being 
revised and extended but is available upon request. 
The authors wish to express their gratitude to Robert 
R. Dies for his extensive contribution toward the 
manual. 

The manual also contained instructions for integrating 
ratings of individual stories into an overall 
protocol rating. 


Reliability 


Interrater reliability was assessed by randomly 
selecting protocols and comparing ratings between 
the authors and with two additional judges.2 Ken- 
dall tau coefficients for the protocol ratings of one 
set of 10 Ss ranged from .508 to .847, all significant 
beyond the .02 level of confidence. (While Kendall’s 
tau was considered most appropriate for this type of 
rating scale, Spearman rank-order and Pearson 
product-moment correlations were computed to pro- 
vide a comparison with other studies reporting these 
statistics. The corresponding values of rho, corrected 
for ties, ranged from .605 to .907, all significant 
beyond the .05 level. The associated values of r varied 
from .504 to .879 and were significant beyond the 
.01 level.) The highest reliability was obtained with 
a judge who was totally uninformed of the experi- 
mental design or purpose and rated on the basis of 
the manual alone. Additional samples of 15 and 25 
protocols yielded reliability coefficients of essentially 
the same magnitude; all were significant at the .05 
or .01 confidence level. When correlations were com- 
puted for ratings of individual stories, reliabilities 
were slightly higher. 
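
To make the reliability statistic concrete, the following is a minimal sketch of Kendall's tau for two judges' ratings on a 1-5 scale. It uses the tie-corrected tau-b form, which is an assumption; the text does not state which variant of tau was computed. The function name and the example ratings are hypothetical.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long lists of ratings."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both ratings: drops out of both terms
            elif dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    denom = sqrt((pairs + tied_x_only) * (pairs + tied_y_only))
    return (concordant - discordant) / denom


if __name__ == "__main__":
    # Hypothetical protocol ratings from two judges.
    judge_a = [1, 2, 2, 3]
    judge_b = [1, 2, 3, 3]
    print(kendall_tau_b(judge_a, judge_b))
```

Because a five-point scale produces many tied ratings, a tie-corrected coefficient is the natural choice for data of this kind.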


RESULTS 


From the summary of the analysis of vari- 
ance presented in Table 1, it can be seen that 
significant differences resulted from the ex- 
perimental manipulation and the variation in 
cards, as well as from the interaction of the 
two variables. Data for this analysis consisted 
of the individual ratings given to each story 
for each subject. 
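
The F ratios in Table 1 can be checked from the mean squares. The sketch below assumes the usual error terms for a design with repeated measures on cards: Groups is tested against Ss in groups, and Cards and Cards X Groups against Ss in Groups X Cards. It also assumes that the Cards X Groups mean square, which is partly illegible in the scanned table, is 1.72. The names are hypothetical.

```python
# Mean squares as read from Table 1 (1.72 is an assumed reading).
MEAN_SQUARES = {
    "groups": 183.93,
    "ss_in_groups": 4.09,
    "cards": 1.73,
    "cards_x_groups": 1.72,
    "ss_in_groups_x_cards": 0.62,
}


def f_ratios(ms):
    """Between-subjects effect over the between-subjects error term;
    within-subjects effects over the within-subjects error term."""
    return {
        "groups": ms["groups"] / ms["ss_in_groups"],
        "cards": ms["cards"] / ms["ss_in_groups_x_cards"],
        "cards_x_groups": ms["cards_x_groups"] / ms["ss_in_groups_x_cards"],
    }


if __name__ == "__main__":
    for source, f_value in f_ratios(MEAN_SQUARES).items():
        print(f"{source}: F = {f_value:.2f}")
```

Under these assumptions the computed ratios agree with the tabled values of 44.97, 2.79, and 2.77.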


2 Appreciation is expressed to Herbert J. Cross and 
Robert R. Dies, who served as judges for these 
analyses. 




TABLE 3 
AVERAGE PROTOCOL RATINGS BY SEX AND GROUP 

                 Group 
Sex         Guarded     Open 
Male         3.0 a      3.9 b 
Female       1.9 b      4.1 c 

a n = 14. 
b n = 15. 
c n = 17. 

The relationships of the variables are 
further elucidated by Table 2, which gives 
means and standard deviations by groups and 
cards; differences associated both with cards 
and with the interaction of cards and groups 
are explicable in terms of the open group. 
Stories of open subjects were consistently 
given higher (more open) ratings, but these 
subjects varied appreciably more in their re- 
sponses to the different cards than did guarded 
subjects. The difference was most apparent 
with the blank card, No. 16; open subjects 
told stories which were rated considerably 
more open, but the stories for this card told 
by the guarded subjects were rated no dif- 
ferently from those elicited by the other cards. 

The design of the study limited the analy- 
sis of differences in openness associated with 
subject characteristics to the variability be- 
tween males and females; a comparison of 
means is presented in Table 3. The average 
ratings assigned to stories of males and fe- 
males in the two groups revealed that open 
and guarded females differed more than did 
the two groups of males; the means of the 
two male groups are not significantly different 
when compared by t test, but the female comparison 
yields a t of 7.05, significant at the 
.01 level. While the protocols of both sexes 
received essentially the same ratings in the 
open group, females’ stories were rated as 
being more guarded when subjects were given 
guarded instructions; only the male-female 
comparison in the guarded group is significant 
by t test (t = 3.10, p < .01). 

A more global indication both of the suc- 
cess of the subjects’ role-playing attempts and 
of the validity of the ratings is the fact that 
among 32 subjects in the open condition, the 
protocol ratings of 25 were either 4 or 5; 
among 29 in the guarded group, 20 protocols 
were rated 1 or 2. Four in the open group and 
2 in the guarded group were essentially un- 
classified, having received midscale ratings of 
3. Misclassifications were as follows: 10% 
(3) in the open group received ratings of 1 
or 2, while 24% (7) in the guarded group 
received ratings of 4 or 5. Sex differences are 
reflected here also; 6 of the 7 misidenti- 
fied in the guarded group were males. These 
protocols were rerated by both authors, and 
it was concluded that the ratings were not in 
error but were accurate assessments of the 
degree of openness reflected in the protocols. 
These protocols dealt with ostensibly highly 
personalized and anxiety-provoking situations 
markedly different from the typical guarded 
ones. The failure of these subjects to comply 
adequately with the experimental instructions 
raises some intriguing questions which will be 
discussed. 


DISCUSSION 


The results of this study are interpreted as 
raising significant questions about a major 
assumption underlying projective techniques. 
Support is given to the growing realization 
that the subject is not the passive, inert part- 
ner in this enterprise that he has sometimes 
been considered. How he construes the test’s 
purpose and how successful he wants the test 
to be are powerful determinants of the nature 
of his protocol. We are led to join those who 
conceptualize testing situations as personal 
interactions similar to other social encounters; 
if the subject responds for some reason by 
wanting the examiner to know of his personal 
and private conflicts and anxieties, certainly 
more will be revealed than if he finds the test, 
the examiner, and the situation somewhat 
mysterious, formidable, and threatening. 

The potential utility of the demonstration 
that subjects can exercise conscious control 
over the openness of their TAT protocols is 
enhanced by the discovery that degree of 
self-revelation is subject to relatively reliable 
assessment. The magnitude of interrater reli- 
ability coefficients strongly suggests that vari- 
ations in openness can be objectively evalu- 
ated, but simultaneously makes clear that all 
problems in that regard have not yet been 
solved. The fact that judges reported having 
to rely at times on intuitive reactions, being 
unable to make distinctions solely on the basis 
of factors listed in the manual, has emphasized 
the need for refinement of the rating cate- 
gories. 

Particularly when a subject is attempting 
to communicate openly, TAT cards appear to 
differ appreciably in the extent to which they 
can be used for personal revelation. The ob- 
served intrasubject variability in combination 
with intersubject variability leads to the hy- 
pothesis that openness on any particular card 
is a function of objective characteristics of the 
card, as well as of attributes of the examinee. 
Which specific cards should prove of greatest 
general utility would, in all probability, de- 
pend to a considerable extent on the popula- 
tion of which the examinee is a member. 
Further research is needed to study the inter- 
action of a variety of stimulus and subject 
characteristics in the production of open pro- 
tocols. That Card 16 (blank) permitted the 
greatest degree of openness with these sub- 
jects is presumably a reflection both of the 
intellectual level and verbal fluency of college 
students and of the absence of structure in the 
stimulus. The appearance of sex differences in 
our data underscores the complexity of the 
projective test situation. Whether this result 
is a peculiarity of our sample or whether fe- 
males do, in actuality, have a greater latitude 
of openness under their conscious control 
must, for the present, remain a matter of con- 
jecture. At this point, however, it appears 
that degree of self-revelation is some joint 
function of, at least, subject intent, stimulus 
material, and sex. 

The 10 subjects whose protocols were rated 
in the opposite direction from the group to 
which they belonged represent an interesting 
subset of the experimental sample. The open 
stories told by subjects instructed to be 
guarded definitely suggest a high degree of 
emotional involvement, dealing frankly with 
conflicts related to sex and to academic problems. 
Assuring subjects of anonymity rendered 
it impossible to collect more data on the 
personalities and adjustment of these indi- 
viduals. It is possible, however, that they pro- 
vide vindication for Frank’s (1939) position; 
that is, they appear troubled to a degree 
which prompts them, “despite themselves,” to 


communicate openly when given an oppor- 
tunity. If we accept this reasoning, it may 
also be true that these subjects bear a greater 
resemblance to a patient population than do 
the complying subjects; this possibility must 
be submitted to further empirical investiga- 
tion. The three subjects in the open condition 
who were judged to be guarded present a 
simpler analytic task: One seemed unin- 
volved in the task and attempted wittiness in 
all his stories; the other two reflected rather 
bland emotion and possible limitations of 
personal insight, motivation, verbal facility, 
and/or simple willingness to cooperate. 

The results of this study should not be 
construed as reflecting adversely on the use 
of techniques such as the TAT. Rather, it 
should be understood that the extent of per- 
sonal information revealed in a projective 
protocol results from a complex interaction 
of a number of factors, important among 
which is the individual’s desire to communi- 
cate with the examiner. Whether the effects 
reported here are generalizable to other projec- 
tive tests, such as the Rorschach, is a ques- 
tion which must be answered empirically. 
Such data as those of Henry and Rotter 
(1956) and the fact that the present study 
deals essentially with an assumption basic to 
all projective techniques would suggest that 
self-revelation on the Rorschach might also 
be subject to conscious control. 

Further research being conducted is con- 
cerned with the desire to communicate as a 
result of therapeutic interaction and with the 
personality characteristics of subjects who are 
unable or unwilling to comply with guarded 
instructions. 
Ss were exposed to a tape-recorded verbal attack from a fictitious S in the 
last of a series of discussion rounds. A second fictitious S was nonattacking. 
Previously, Ss had been either (a) assigned to work with the attacker on a 
postdiscussion problem-solving task (Group 1), (b) assigned to work with 
the nonattacker (Group 2), or (c) given their choice of partner (Group 3). 
Group 1 showed the least drop in evaluative ratings of the attacker. This 
finding supports a functional defense hypothesis which predicts that Ss without 
an available adaptive response will perceive an attacker more favorably than 
will Ss who have an adaptive response. No evidence was found that Ss 
classified as repressors and sensitizers consistently differed in their ratings. 


The recent interest shown by personality 
researchers in cognitive style has produced a 
variety of dichotomous classifications with 
such diverse labels as leveling-sharpening 
(Holzman & Gardner, 1959), facilitation- 
inhibition (Ullman, 1962), externalization- 
internalization (Shannon, 1962), and repres- 
sion-sensitization (Altrocchi, Parsons, & 
Dickoff, 1960; Byrne, 1961). This last clas- 
sification has been particularly instrumental 
in stimulating research since the development 
of an MMPI-based scale (R-S scale) which 
accurately and reliably identifies persons as 
repressors or sensitizers (Altrocchi et al., 
1960; Byrne, 1961; Byrne, Barry, & Nelson, 
1963). Repressors are persons who character- 
istically respond to psychological threats with 
avoidance defenses (e.g., perceptual defense), 
whereas sensitizers typically respond to 
threats with approach behavior. 

Most of the previous research has been 
concerned with demonstrating that individual 
differences in repression-sensitization exist. 
Byrne (1964), after an exhaustive review of 
the literature, concluded that repression- 
sensitization is a meaningful dimension of 


1 A portion of a dissertation presented to the 
faculty of the Graduate School of the University 
of Kentucky in candidacy for the degree of doctor 
of philosophy. The author is indebted to James C. 
Baxter, who co-directed the research, and to Melvin 
J. Lerner, who served on the special committee, for 
their excellent suggestions and criticisms during all 
stages of research. Special thanks go to Robert 
Straus, of the Behavioral Science Department of the 
University of Kentucky College of Medicine, for 
the use of the Department’s excellent laboratory 
facilities. 


personality, although relatively few people use 
either defensive style to an extreme degree. 
Unfortunately, little information is available 
about the nature of the mechanisms for re- 
pressive and sensitizing defenses, although 
Dulany (1957) has demonstrated that sub- 
jects can be conditioned to use either per- 
ceptual defense or vigilance. Since the two 
defense styles are probably used in different 
situations by most individuals, it is likely that 
environmental factors determine which is 
selected. Two major points of view have been 
offered concerning the conditions for using 
a particular kind of defense. 

Bruner (1957) stressed the adaptive func- 
tion of perception. This position maintains 
that vigilance or sensitizing behavior will 
occur when the situation allows the person 
to make an adaptive response to the threat. 
If a person comes into contact with a threat- 
ening situation which he feels confident he 
can handle, it is to his advantage to recognize 
the threat as quickly as possible so that he 
can do something about it. On the other 
hand, if a person sees no way out of a 
threatening situation, one solution is to deny 
the threat in perception or somehow distort 
one’s perception to make the threat less 
severe. This is perceptual defense. Thus, a 
functional defense hypothesis would predict 
that if an escape route from the source of 
a threat is accessible, knowledge of the ex- 
istence of this escape route should lead to 
sensitizing defenses (vigilance behavior) when 
the threat is introduced. Conversely, the lack 
of awareness of an escape route should lead 
to the use of repressive defenses. 

An opposing point of view was offered by 
Janis (1962) who postulated that the psycho- 
logical reaction to danger involves two kinds 
of information: (a) clear information about 
the magnitude of the threat of danger and 
(b) information about the effectiveness of 
personal resources for coping with the threat. 
The greatest amount of vigilance behavior 
will occur when the person has much informa- 
tion of the first type and none of the second. 
Specifically, Janis (1962) hypothesized that 
“strength of vigilance tendencies will be in- 
creased by a warning communication or 
physical sign interpreted by the perceiver as 
indicating that a currently available escape 
route will become inaccessible to him once 
the danger materializes [p. 74].” While 
Bruner and Janis agree on the importance of 
perceived information about the outcome for 
choice of defensive style, they disagree about 
its effects on behavior. When the availability 
of an escape route is not anticipated, the 
functional defense hypothesis predicts de- 
creased vigilance behavior, while the Janis 
hypothesis makes the opposite prediction of 
increased vigilance. The present study at- 
tempted to test the relative accuracy of these 
two predictions in an interpersonal situation. 
In addition, the prediction was made that 
persons who have been identified as repres- 
sors and sensitizers by psychometric criteria 
(R-S scale) should exhibit consistent indi- 
vidual differences in response to threat in an 
interpersonal situation. 


METHOD 
Design 


The experiment was conducted in two phases. As- 
signment of Ss to experimental conditions was made 
prior to the first phase or “discussion rounds.” There 
were three conditions in the experiment, but Ss in 
all groups were treated alike in the first phase of 
the experiment proper. 

Condition 1. The Ss in this condition were assigned 
to work with a “threatening partner” (Person A) in 
Phase 2 of the experiment. Since this removed S's 
opportunity to make an adaptive response to the 
threat (avoiding the threatening person), it follows 
from Bruner (1957) that it should encourage the 
use of repressive defenses, that is, denying the 
threatening characteristics of that person. The Ss in 
this condition should rate the threatening person 
less unfavorably than Ss who were not assigned 


to work with him, according to the functional 
defense hypothesis. On the other hand, the Janis 
hypothesis predicts that Ss in this condition should 
be more vigilant of the threat and thus produce 
less favorable ratings of the threatening person than 
Ss not assigned to work with him. 

Condition 2. This condition assigned Ss to work 
with a nonthreatening partner (Person B) in 
Phase 2. Since these Ss could make no instrumental 
avoidance response to their partner, a repressive 
type of defense would be functional but unneces- 
sary since B does not constitute a threat. No specific 
predictions were made for Ss in this condition, 
although it was intended to serve as a control con- 
dition which would reflect individual differences along 
the repression-sensitization dimension in response to 
A and B. 

Condition 3. This condition differed from Condi- 
tions 1 and 2 in that the real Ss were given a choice 
of working with A or B. This choice was designed 
to provide Ss with an effective instrumental avoid- 
ance response to the threat resulting from A’s verbal 
attack, thus reducing the magnitude of the threat. 
The provision of an adaptive response was intended 
to eliminate S’s necessity for denying the threat. 
The functional defense hypothesis predicts that the 
ratings of A will be lower in this condition, com- 
pared with the ratings made by Ss in Condition 1, 
and that the recognition of the threat will occur 
more quickly in this condition than in Condition 1. 
That is, Ss will lower their rating of A sooner in 
the sequence of rounds than Ss in Condition 1. 
The Janis hypothesis, on the other hand, predicts 
that these ratings will be higher in comparison with 
Ss in Condition 1, since there is no need to increase 
vigilance behavior. 


Subjects 


The Ss were 57 male volunteers from the subject 
pool of the introductory psychology course at the 
University of Kentucky. Twelve Ss were eliminated 
from the analysis of the data because they suspected 
the purpose of the experiment or failed to comply 
with some aspect of the experimental procedure. 
Three additional Ss were eliminated in order to 
balance the design. The analysis of the data reported 
here is based on 42 Ss, 14 in each of the three 
experimental conditions. 

All Ss were recruited individually by telephone 
from students who took the R-S scale during one 
of several group testing sessions. Out of 58 Ss who 
were called, only 1 refused to participate. 


Equipment 


The experiment was conducted in four adjacent 
experimental rooms. Three of the rooms were 
identically equipped with a table, chair, microphone, and 
set of earphones, to give the impression that they 
would all be used during the experiment. The fourth 
room served as a master control room from which 
E could observe S’s behavior through a one-way- 
vision mirror, operate a tape recorder, and monitor 
the intercommunication network. 
Procedure 


When S appeared for the experiment, he was es- 
corted through two experimental rooms into the 
innermost experimental room, given a sheet of in- 
structions, and told that he would hear more detailed 
instructions over the intercommunication system. 
Once settled, S heard tape-recorded instructions 
which described the “purpose” of the experiment in 
detail. The S was told that he would participate in 
a series of discussion rounds with the two other Ss 
and that these discussions were a preliminary to the 
second part of the experiment, which would consist 
of a cooperative problem-solving task. The S had 
two tasks during the discussion rounds. First, he 
was to formulate an impression of the other Ss’ per- 
sonalities as a basis for predicting how well he would 
be able to get along with them in the second part 
of the experiment. Second, he was to give a clear 
description of his personality characteristics relevant 
to the assigned topic. Phase 2 of the experiment, 
the problem-solving task, was not actually conducted 
but was included in the instructions to create in the 
Ss the anticipation of having to confront one of 
the fictitious Ss. 

The experiment consisted of having S (designated 
as Person C) participate in a series of five discussion 
rounds with two fictitious Ss (designated as Persons 
A and B), represented by prerecorded tapes. The 
S was notified by a buzzer when to begin and stop 
talking. In addition, S had a typed instruction sheet 
which identified him for purposes of the experiment, 
described the topics for discussion, and outlined the 
order of speaking for each discussion round. The S 
was told to speak only in accordance with the pre- 
determined order and not to ask questions since the 
electrical circuit allowed only one person to speak 
at a time. The topics for the discussion rounds were 
similar to those described by deCharms and Wilkins 
(1963).2 Generally, Ss were asked to describe their 
background, their social self, their “good” and “bad” 
characteristics, and finally to compare themselves 
with one another. During the discussion rounds, 
Person B was characterized as a placid, easygoing, 
and good-natured fellow. Person A, on the other 
hand, emerged as an arrogant, condescending, and 
egotistical fellow who became increasingly obnoxious 
and hostile as the rounds progressed. This behavior 
culminated in a highly abusive verbal attack by A 
on the real S for 1 minute, 45 seconds during the 
last round. The attack started mildly but ended 
with a series of strong attacking phrases, originally 
used by Thibaut and Coules (1952), directed at the 
inconsistency of S’s self-descriptions which the dis- 
cussion rounds were designed to evoke. After the 
discussion rounds, E interviewed S to detect any 
suspicion about the nature of the experiment. During 
the interview, S was disabused of all threatening 
aspects of the experiment, told the nature of the 
deception, and asked not to discuss the experiment 
with other students. 


2 The author is indebted to Richard deCharms for 
his helpful comments and the generous loan of one 
of his tapes for use as a model for the tape used 
in the present investigation. 


TABLE 1 
MEAN ROUND RATINGS OF PERSONS A AND B 

          Condition 1      Condition 2      Condition 3 
Round      A       B        A       B        A       B 
1         3.33    4.35     3.62    4.16     3.78    3.83 
2         3.21    4.43     3.71    4.33     3.27    3.95 
3         3.11    4.38     3.45    4.09     3.21    3.76 
4         2.78    4.32     2.78    4.12     2.50    3.50 
5         2.24    4.43     1.95    4.44     2.13    3.91 
Mean      2.93    4.38     3.10    4.25     2.98    3.79 


Measurement of the Dependent Variables 


The two primary dependent variables in the ex- 
periment were (a) S’s ratings of Persons A and B 
during the discussion rounds (round ratings) and 
(b) S’s ratings of A and B on 20 adjective scales 
after the discussion rounds were completed (final 
ratings). The S made the round ratings every time 
he heard a bell that occurred at approximately 
15-second intervals during the rounds. Altogether, 
each S made a total of 37 ratings, 18 for B and 19 
for A. The difference in number of ratings for A 
and B resulted from the difference in length of the 
final round. The round ratings were made on five-point 
scales which contained the word “positive” 
at one end, “negative” at the other, and “cannot 
say” in the middle. These ratings were intended to 
elicit a continuing measure of how favorably or 
unfavorably the two fictitious Ss were perceived by 
the real S. The final rating forms consisted of 20 
nine-point scales developed to reflect good-bad 
personality traits. 


RESULTS 


The subjects in each condition were clas- 
sified as repressors or sensitizers by dividing 
them at the median R-S scale score. Analyses 
of variance performed separately for the sub- 
jects classified as repressors and sensitizers 
revealed no significant differences across 
experimental conditions (all Fs < 1.00), indi- 
cating that the subjects in each condition were 
comparable on this dimension. Tables 1 and 2 
show the mean round and final ratings. For 
both sets of ratings, the lowest possible rating 
(unfavorable) was 1.00 and the highest rating 
was 5.00 (favorable). 
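
As a minimal sketch of the classification step described above, the median split can be written in a few lines of Python. The scores are hypothetical, the function name `median_split` is introduced only for illustration, and the handling of scores falling exactly at the median (assigned here to the repressor group) is an assumption, since the report does not say how ties were treated. Low R-S scores are taken to indicate repression, consistent with the group means reported later in the text.

```python
from statistics import median

def median_split(scores):
    """Label each R-S score as 'repressor' (at or below the median)
    or 'sensitizer' (above the median)."""
    cut = median(scores)
    return ["sensitizer" if s > cut else "repressor" for s in scores]

# Hypothetical R-S scores for the 14 Ss of one condition (illustrative only)
condition_scores = [12, 18, 25, 31, 36, 40, 41, 44, 52, 58, 63, 70, 77, 85]
labels = median_split(condition_scores)

for score, label in zip(condition_scores, labels):
    print(score, label)
```

Splitting within each condition, as the text describes, keeps the number of repressors and sensitizers equal across conditions.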
TABLE 2 
MEAN FINAL RATINGS OF A AND B 

              Condition 
Person     1       2       3 
A         4.19    4.18    4.44 
B         7.53    7.55    7.25 


Perception of Persons A and B 


Analyses of variance revealed significant 
main effects for persons rated in both the 
round ratings (F = 242.45, p < .001) and 
the final ratings (F = 196.43, p < .001), in- 
dicating that A and B were clearly differenti- 
ated by the experimental subjects. These 
findings suggest that the fictitious roles were 
successful in creating the two distinctly dif- 
ferent impressions intended. Individual com- 
parisons among the ordered means of Rounds 
1-5 for Person A by the Newman-Keuls 
procedure (Winer, 1962) indicated that the 
size of the means decreased progressively from 
Round 1 to Round 5 and that the mean for 
Round 1 (3.58) was significantly larger 
(p < .01) than the means for Rounds 4 
(2.69) and 5 (2.11). No differences existed 
in the ratings of B across rounds. The dif- 
ference in ratings for A and B on Round 1 
(1.02), while significant (p < .01), was less 
than one-half the difference between A and B 
on Round 5 (2.09). These comparisons sug- 
gest that A was rated more negatively than B 
from the outset of the communication phase 
and was seen as increasingly more unattrac- 
tive as the rounds progressed. All individual 
comparisons reported were done by the 
Newman-Keuls procedure for ordered means, 
and the accompanying significance levels 
represent two-tailed tests. 
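
The pooled round means for Person A quoted above (3.58, 2.69, and 2.11) can be recovered from the condition means in Table 1. The sketch below assumes that an unweighted average of the three condition means is appropriate because each condition contained the same number of Ss (14); the variable names are illustrative only.

```python
# Mean round ratings of Person A in Conditions 1, 2, and 3 (from Table 1)
round_means_a = {
    1: [3.33, 3.62, 3.78],
    2: [3.21, 3.71, 3.27],
    3: [3.11, 3.45, 3.21],
    4: [2.78, 2.78, 2.50],
    5: [2.24, 1.95, 2.13],
}

# Equal n per condition, so the pooled mean is the simple average
pooled = {rnd: round(sum(vals) / len(vals), 2)
          for rnd, vals in round_means_a.items()}

for rnd, mean in pooled.items():
    print(f"Round {rnd}: {mean:.2f}")
```

The pooled values fall from round to round, which is the progressive decrease the Newman-Keuls comparisons describe.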


Functional Defense Hypothesis versus Janis 
Hypothesis 


The first comparison of these two hypotheses 
was a comparison of the mean ratings in 
the three experimental conditions. As Table 1 
indicates, Person A was rated higher in both 
Conditions 2 (M = 3.10) and 3 (M = 2.98) 
than in Condition 1 (M = 2.93). An analysis 
of variance indicated that the difference between 
these means was not significant (F = 
1.47, p < .02), but the means were in the 
direction predicted by the Janis hypothesis. 


TABLE 3 

MEANS OF ROUNDS 1 AND 5 AND DIFFERENCES BETWEEN 
MEANS FOR PERSON A 

              Condition 1     Condition 2     Condition 3 
Item           R1     R5       R1     R5       R1     R5 
Mean          3.33   2.24     3.62   1.95     3.78   2.13 
Difference       1.09            1.67            1.65 

Because of the unanticipated early differen- 
tiation of A and B, a more appropriate test 
of the functional defense hypothesis is pro- 
vided by the amount of change in the percep- 
tion of A between Rounds 1 and 5. If the 
unavailability of an adaptive response was 
successful in encouraging the use of repres- 
sive defenses, the ratings of A should have 
decreased less substantially in Condition 1 
than in Condition 3. Table 3 shows the mean 
ratings of A on Rounds 1 and 5 and the dif- 
ference between the means for the three ex- 
perimental conditions. The differences be- 
tween the means were almost identical in 
Conditions 2 and 3 (1.67 and 1.65), but the 
difference was considerably smaller in Condi- 
tion 1 (1.09). Individual comparisons among 
these mean difference scores by the Newman- 
Keuls procedure revealed that the difference 
between Conditions 1 and 3 barely missed 
significance at the .05 level (a value of .57 
is required for significance, and the obtained 
value was .54). However, a Mann-Whitney U 
test (Walker & Lev, 1953) indicated that 
the difference between these two conditions 
was significant at the .02 level. Although this 
result appears to support the functional de- 
fense hypothesis, it must be interpreted with 
caution. Since the Round 1 mean was smaller 
in Condition 1 (3.33) than in Condition 3 
(3.78), the greater shrinkage in Condition 3 
may have been due to the initially higher 
rating in this condition. 
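
The change scores discussed above are simply the Round 1 mean minus the Round 5 mean within each condition. A brief sketch, using the Table 3 means and illustrative variable names, shows the arithmetic; it does not reproduce the Newman-Keuls or Mann-Whitney tests, which require the individual Ss' scores.

```python
# (Round 1 mean, Round 5 mean) for Person A in each condition, from Table 3
table3 = {
    1: (3.33, 2.24),
    2: (3.62, 1.95),
    3: (3.78, 2.13),
}

# Decrease in the rating of A between the first and last rounds
drops = {cond: round(r1 - r5, 2) for cond, (r1, r5) in table3.items()}

for cond, drop in drops.items():
    print(f"Condition {cond}: decrease of {drop:.2f}")
```

The smaller decrease in Condition 1 is the pattern the functional defense hypothesis predicts, subject to the caution about unequal Round 1 means noted in the text.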

Probably the most sensitive test of the 
functional defense hypothesis and the Janis 
hypothesis was provided by the ratings made 
during the final round, which contained the 
attack. Since the attack was confined to this 
round and gradually increased in intensity, 
the ratings made during this period should 
reflect the subject’s maximal use of defenses. 
Table 4 shows the mean ratings made during 
the final (attack) round. An analysis of the 
ratings made during this round for Person A 
indicated that the main effect for Rating 
Points was significant (F = 16.87, p < .001). 
Since the Experimental Conditions x Rating 
Points interaction was also significant 
(F = 4.06, p < .001), individual comparisons 
were made among the ordered means for the 
seven rating points within each condition. 
These comparisons showed that the subjects 
in Condition 1 did not lower their ratings of 
A significantly (i.e. at the .01 level) until 
the fourth rating point was reached. 
However, the subjects in Condition 3 low- 
ered their ratings of A almost immediately. 
For this condition, the mean for the 
second rating point (2.12) was significantly 
lower (p< .01) than the mean for the 
first rating (3.19). An inspection of the means 
for each rating point revealed that the means 
in Condition 1 diminished less rapidly than 
the means in Condition 3. Individual compari- 
sons of the difference between the means for 
Rating Points 1 and 7 indicated that the dif- 
ference in Condition 1 (.95) was significantly 
smaller (p < .01) than the difference in Con- 
dition 3 (1.83), suggesting that repressive 
defenses influenced the perception of the sub- 
jects in Condition 1. These results support 
the functional defense hypothesis. 
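
The attack-round comparison can be illustrated from the rating-point means in Table 4. The sketch below computes, for each condition, the total decrease from Rating Point 1 to Rating Point 7 and the decrease at the first step; the variable names are illustrative, and the significance tests themselves are not reproduced.

```python
# Mean ratings of Person A at Rating Points 1-7 of the attack round (Table 4)
attack_means = {
    1: [2.82, 2.71, 2.43, 1.93, 1.86, 2.21, 1.87],
    2: [2.21, 2.14, 1.79, 1.79, 1.76, 1.90, 1.86],
    3: [3.19, 2.12, 2.33, 2.41, 1.86, 1.64, 1.36],
}

# Decrease over the whole attack (RP1 minus RP7)
total_drop = {c: round(m[0] - m[-1], 2) for c, m in attack_means.items()}

# Decrease between the first two rating points (RP1 minus RP2)
first_step_drop = {c: round(m[0] - m[1], 2) for c, m in attack_means.items()}

for c in attack_means:
    print(f"Condition {c}: total {total_drop[c]:.2f}, "
          f"first step {first_step_drop[c]:.2f}")
```

The totals for Conditions 1 and 3 agree with the .95 and 1.83 given in the text, and the large first-step decrease in Condition 3 corresponds to the report that those Ss lowered their ratings almost immediately.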


Individual Differences in Repression- 
Sensitization 


Contrary to expectations, none of the re- 
sults reported so far offered any support for 
the individual difference hypothesis. Analyses 
of variance indicated that repressors and 
sensitizers did not differ significantly in either 
their round or final ratings. However, an 
analysis of the ratings made during the 
final round showed a significant interaction 
for Experimental Conditions X Repressors-Sensitizers 
X Rating Points (F = 4.00, p < 
.001). Individual comparisons between the 
means for repressors and sensitizers revealed 
that the sensitizers’ mean rating for Person A 
in Condition 1 (2.40) was significantly higher 
(p < .01) than the mean rating for A by 
repressors (2.12). No significant differences 
in the mean ratings of A by repressors and 
sensitizers existed in the other experimental 
conditions. These results indicate that repressors 
and sensitizers were differentiated in the 
high-threat condition, but the results were 
in the opposite direction to those expected 
on the basis of previous research. 


TABLE 4 

MEAN RATINGS OF PERSON A DURING THE ATTACK ROUND 

Condition    RP1    RP2    RP3    RP4    RP5    RP6    RP7 
1            2.82   2.71   2.43   1.93   1.86   2.21   1.87 
2            2.21   2.14   1.79   1.79   1.76   1.90   1.86 
3            3.19   2.12   2.33   2.41   1.86   1.64   1.36 

Note. RP designates the rating points occurring at 15-second 
intervals during the attack. 

Since the subjects were classified as repres- 
sors and sensitizers on the basis of a division 
at the median score in each condition, it can 
be argued that many of the subjects were 
not different enough in the R-S scale scores 
to be considered “legitimate” repressors or 
sensitizers. In view of this, analyses of vari- 
ance were performed on the round ratings 
and final ratings of the most extreme sensi- 
tizers and repressors in each condition (n = 6 
in each condition). For these subjects, scores 
on the R-S scale ranged 10-108, which is 
comparable to the norms for college males 
(Byrne et al., 1963). The mean R-S scale 
score for repressors was 14, approximately 
1.5 standard deviations below the mean for 
college males (M = 42.45). The sensitizers’ 
mean was 70.66, which is about 1.5 standard 
deviations above the mean for college men. 
The results were consistent with the findings 
based on the data from the total sample. No 
significant differences were obtained for re- 
pressors and sensitizers on either the round 
ratings or final ratings. An analysis of the 
ratings made during the final round also failed 
to reveal any differences between the subjects 
classified as extreme repressors and sensitizers. 
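
The statement that the extreme groups lay about 1.5 standard deviations from the college-male mean can be checked by simple arithmetic. The normative standard deviation is not given in this report, so the sketch below only backs out the value implied by the reported means; the function name `implied_sd` is illustrative.

```python
NORM_MEAN = 42.45  # college-male mean on the R-S scale, as cited in the text

def implied_sd(group_mean, z):
    """Standard deviation implied if group_mean lies z SDs from NORM_MEAN."""
    return (group_mean - NORM_MEAN) / z

# Repressors: mean of 14, about 1.5 SDs below the norm
sd_from_repressors = implied_sd(14.0, -1.5)
# Sensitizers: mean of 70.66, about 1.5 SDs above the norm
sd_from_sensitizers = implied_sd(70.66, 1.5)

print(round(sd_from_repressors, 1), round(sd_from_sensitizers, 1))
```

Both group means imply a normative standard deviation of roughly 19 scale points, so the two "1.5 standard deviations" statements are mutually consistent.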

As noted earlier, Condition 2 was intended 
to serve as a control condition which would 
reflect individual differences in response to 
threat by repressors and sensitizers. Since the 
Conditions X Repressors-Sensitizers X Persons 
interaction was significant (F = 3.39, 
p < .05) for the round ratings, individual 
comparisons were made between the mean 
ratings of A for repressors and sensitizers 
within Condition 2. The mean rating of A 
by repressors (3.23) did not differ significantly 
from that of sensitizers (2.97). For 
the final ratings, the mean rating of A was 
3.68 for repressors and 4.67 for sensitizers. 
This difference was not significant (p < .10) 
and was in the opposite direction from the 
predicted outcome. These findings provide 
additional evidence that individual differences 
in repression-sensitization were not obtained 
in the present investigation. 


Discussion 


To protect against the criticism that the 
threat in laboratory experiments is frequently 
not sufficient to evoke defensive behavior 
(Madison, 1961), the present study used a 
threat previously demonstrated to be strong 
enough to evoke hostility (deCharms & Wil- 
kins, 1963). Indeed, the present results sug- 
gested that the amount of threat necessary to 
arouse defenses may have been overestimated, 
thereby limiting the sensitivity of the experi- 
mental design. Since A and B were discrimi- 
nated in Round 1, it was not possible to test 
the prediction involving differential recogni- 
tion of the threat in the experimental condi- 
tions. However, during the attack round, the 
subjects in Condition 1 lowered their rating 
of A less quickly than the subjects in Condi- 
tion 3. Since only that short period of time 
which contained the attack could be consid- 
ered as an unequivocal threat to the subjects, 
the ratings made during the attack provide 
the most salient data. These results are con- 
sistent with the functional defense hypothesis. 

Although most of the results did not reveal 
any consistent differences in defensive be- 
havior as a function of individual differences 
in repression-sensitization, one difference was 
found. The sensitizers in Condition 1 rated A 
higher than repressors during the attack 
round. This finding is contradictory to the 
individual difference hypothesis since it indi- 
cates that sensitizers used repressive defenses 
in a high-threat situation while repressors did 
not. Neither the discrepancy in the character- 
ization of A and B nor the magnitude of the 
threat can be held accountable for this reversal 
from prediction. Although the use of a 
median split can be interpreted as being un- 
favorable for establishing clear-cut differences 
in repression-sensitization, the fact remains 
that no significant differences were obtained 
when the results were analyzed in terms of 
extreme groups. However, the subsample size 
was so small that it is difficult to make any 
general conclusions. 

Since the threat was of sufficient strength 
and the use of aggression was an appropriate 
stimulus for eliciting defensive behavior 
(Tempone, 1963), the failure of the present 
study to demonstrate consistent individual 
differences in a complex interpersonal situa- 
tion points to the need for more relevant 
theoretical models to explain repression-sensi- 
tization. The almost completely negative find- 
ings with regard to individual differences in 
repression-sensitization suggest that it may be 
too soon to consider this concept as a “mean- 
ingful personality dimension.” In this con- 
text, it is interesting to note that several other 
recent studies concerned with defensive be- 
havior of the repression-sensitization variety 
have used above-threshold threatening stim- 
uli (Altrocchi, Schrauger, & McLeod, 1964; 
Speisman, Lazarus, Mordkoff, & Davison, 
1964) and have been only minimally suc- 
cessful in finding consistent individual dif- 
ferences. The results considered together sug- 
gest the need for additional research on re- 
pression-sensitization in the interpersonal area, 
research designed to develop more adequate 
concepts. It may be that this dimension has 
outgrown its tools in the perceptual defense 
area. 
NOTES AND COMMENTS 


PERSONALITY CHARACTERISTICS OF YOUNG MALE 
NARCOTIC ADDICTS 1 


JEANNE G. GILBERT AND DONALD N. LOMBARDI 2 
Mount Carmel Guild, Newark, New Jersey 


A comparison was made of the personality characteristics, as measured by the 
MMPI, of 45 young male narcotic addicts and 45 nonaddicted males of similar 
socioeconomic level. Although some maladjustment existed in both groups, 
results suggest deep-seated and widespread pathology among the addicts. 
Outstanding are the addict’s psychopathic traits, his depression, tension, insecurity, 
and feelings of inadequacy, and his difficulty in forming warm and lasting 
interpersonal relationships. Most addicts seem to be suffering from a basic 
character disorder, although many also have associated psychoneurotic or 
psychotic traits. 


The purpose of this study is to investigate the 
personality characteristics of young male nar- 
cotic addicts. There have been many statements 
concerning the personality characteristics of nar- 
cotic addicts, most of which agree that they have 
either weak or disturbed personalities (Ausubel, 
1961; Chein, Gerard, Lee, & Rosenfeld, 1964; 
Gerard & Kornetsky, 1954; Laskowitz, 1961; 
Savitt, 1963; Vogel, Isbell, & Chapman, 1948; 
Wakefield, 1963). However, most of these evaluations 
have been based on case histories or personal 
interviews with apparently no attempt to 
make use of any standardized tests of personality. 

Hill, Haertzen, and Glaser (1960) used the 
Minnesota Multiphasic Personality Inventory 
(MMPI) to study the personality characteristics 
of 270 adult, adolescent, Negro, and white nar- 
cotic addicts who were undergoing treatment at 
the Public Health Service Hospital at Lexington, 
Kentucky. Of the 200 Negro and white male pa- 
tients only 5.5% were classified as normal; 15.5% 
were unclassifiable; 30.5% were classified as 
psychopathic, 19% as neurotic, and 17% as 
schizoid. An additional 12.5% were placed in a 
psychopathic subgroup on a tentative basis pending 
further study. With the exception of a few 
normals, all patients had elevated scores on the 
Psychopathic Deviate scale and were therefore 
considered conduct disorders. The authors found 
similarity between adolescent and adult narcotic 
addicts and between Negro and white addicts. 
They concluded that “personality characteristics 
of narcotic addicts are either associated with 
psychopathy or are predominantly psychopathic 
in nature, although they may include many of 
the classical psychoneurotic and psychotic features 
[p. 138].” 


1 This study was supported in part by National 
Science Foundation Institutional Grant GU 369. 

2 The authors are also at Fordham University and 
Seton Hall University. 

In the present study an attempt was made to 
compare the personality characteristics of a group 
of young noninstitutionalized narcotic addicts 
with those of a young nonaddicted group of simi- 
lar socioeconomic level. 


PROCEDURE 


Forty-five male narcotic addicts between the ages 
of 17 and 34 years (average age 22.7 years) were 
compared on the MMPI with 45 nonaddicted males 
between the ages of 16 and 23 years (average age 
18.6 years). The addicts were voluntary participants 
in the narcotic program of the Mount Carmel Guild 
in Newark, New Jersey; the young men in the control 
group were secured from the Neighborhood 
Youth Corps on a voluntary basis. The two groups 
came from similarly below-average socioeconomic 
levels of society; most were school dropouts who 
had less than a high school education. There were 
unequal numbers of Negro and white Ss (30 white 
and 15 Negro in the experimental group and 22 
white and 23 Negro in the control group), but, as 
the scores of Negroes and whites were practically 
identical, the two races were treated as one group. 
This procedure was also considered defensible in 
view of similar findings by Hill et al. (1960). Mem- 
bers of the experimental group included persons 
using a wide variety of narcotic and stimulant drugs. 

The MMPI (Hathaway & Meehl, 1951; Hathaway 
& McKinley, 1951) was selected as the instrument for 
the investigation of the personality characteristics of 
these young narcotic addicts. It was hoped that by 
comparing the addicts with a group of their non- 
addicted peers of similar socioeconomic background, 
new light might be shed on the basic personality 
make-up of addicts. More individuals of both groups 
were given the MMPI than were used in this study, 
but all those whose validity scores made the results 
in any way questionable were discarded. 


TABLE 1 

COMPARISON OF SCORES OF CONTROL AND 
ADDICT GROUPS ON THE MMPI 

                Mean             Standard deviation 
Scale     Control    Addict     Control    Addict        t 
L           3.68       2.77       1.89       1.59      -2.47** 
F           9.37       9.42       6.10       5.10        .042 
K          12.28      11.40       4.82       3.62        .970 
Hs         13.35      14.57       4.93       5.45       1.110 
D          18.62      26.84       3.77       6.69        (a) 
Hy         19.95      23.07       5.40       5.63       2.600** 
Pd         25.28      31.46       4.79       4.32       6.438**** 
Mf         24.93      27.20       4.92       5.02       2.210* 
Pa      (illegible)   12.17       4.53       3.65       1.230 
Pt         28.08      34.13       6.28       6.35       4.548**** 
Sc         31.84      33.28       8.89       7.61        .820 
Ma         24.97      24.02       4.64       4.22      -1.020 
Si         26.08      31.62       6.98       8.22       3.441*** 
Es         43.73      43.31       6.70       6.76       -.290 
A           6.74      10.26       3.81       4.62       1.18 
SD         27.97      24.91       5.70       6.66       2.13* 

a Because of significant differences in standard deviations, 
the Mann-Whitney U test (Siegel, 1956) was used in this calculation 
(U = 2009.5); the difference between the means was 
found to be significant beyond the .003 level of confidence. 

* p < .05. 
** p < .02. 
*** p < .01. 
**** p < .001. 


RESULTS AND Discussion 


Table 1 presents the means, standard devia- 
tions, and t-test scores for the experimental and 
control groups. 

It can readily be seen that the experimental 
group as a whole scored higher (indicating greater 
maladjustment) than the control group on all but 
one of the diagnostic scales, and the scores on 
this one scale (Ma) were quite similar. The dif- 
ferences were significant beyond the .001 level of 
confidence for the D, Pd, and Pt scales; at the 
.01 level for the Si scale; at the .02 level for the 
Hy scale; and at the .05 level for the Mf scale. 
However, it must be noted, and this can best be 
seen from Figure 1, which shows the composite 
profiles, that scores on the Si, Hy, and Mf scales 
did not fall above the cutting line of 70 for either 
group. The same holds true of the L score of the 
validity scales on which the control group ob- 
tained a higher score, this difference being signifi- 
cant at the .02 level of confidence. 
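
The t values in Table 1 can be approximately recomputed from the tabled means and standard deviations. The sketch below is not the authors' computation: it assumes 45 Ss per group, as stated in the abstract, and uses the standard error for two independent groups, which for equal group sizes is the same whether or not the variances are pooled. The function name `t_from_summary` is illustrative.

```python
from math import sqrt

def t_from_summary(control_mean, addict_mean, control_sd, addict_sd, n):
    """t statistic for two independent groups of equal size n,
    computed from group means and standard deviations."""
    standard_error = sqrt((control_sd ** 2 + addict_sd ** 2) / n)
    return (addict_mean - control_mean) / standard_error

N_PER_GROUP = 45  # group size reported in the abstract

# (control mean, addict mean, control SD, addict SD) as read from Table 1
pd_scale = (25.28, 31.46, 4.79, 4.32)
pt_scale = (28.08, 34.13, 6.28, 6.35)

t_pd = t_from_summary(*pd_scale, N_PER_GROUP)
t_pt = t_from_summary(*pt_scale, N_PER_GROUP)

print(round(t_pd, 2), round(t_pt, 2))
```

The recomputed values are close to the tabled 6.438 (Pd) and 4.548 (Pt); small discrepancies are to be expected because the tabled means and standard deviations are rounded.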
[Figure 1: profile plot of T scores across the L, F, K, Hs, D, Hy, Pd, Mf, Pa, Pt, Sc, Ma, and Si scales.] 

Fig. 1. Composite MMPI profiles of control and 
addict groups. 


From Figure 1 it can also be seen that both 
groups fell near the cutting line on the Ma scale, 
but that the control group did not fall above the 
cutting line on any scale, whereas the experi- 
mental group fell above the cutting line on the 
D, Pd, Pt, and Sc scales. 

Scores on the Ego Strength scale (Barron, 
1953) were practically identical, the slight favor- 
able difference of the control group being insig- 
nificant. Scores on the Anxiety scale (Bendig, 
1956) suggested somewhat more anxiety on the 
part of the addicts, but the difference between 
the addicts and the controls was not reliable. On 
the other hand, scores on the Social Desirability 
scale (Edwards, 1957) showed a higher mean 
score for the control group, this difference being 
significant at the .05 level of confidence. Edwards, 
however, reported greater social desirability effects 
for Pt, Sc, and F, whereas the addicts scored 
highest on D, Pd, and Pt, and it was on these 
scales that the differences between the experi- 
mental and control groups assumed the greatest 
significance. Also, when the scorable items of the 
Social Desirability scale were omitted from D, 
Pd, and Pt, there still remained reliable differences 
between the two groups on these scales. 
Thus, it would appear that although the addicts 
are more willing to admit to socially undesirable 
traits, this tendency was not responsible for the 
differences found between these groups of addicts 
and nonaddicts. Only 4% of the addicts showed 
normal profiles—that is, obtained no scores above 
the cutting line of 70—whereas 27% of the con- 
trol group gave normal profiles. Eighty percent 
of the addicts and 46.7% of the controls gave 
two or more scores above the cutting line. The 
small percentage of normal personality profiles 
among the narcotic addicts, however, might have 
been expected since this is in close agreement 
with the findings of Pescor (1939, 1943), who 
reported that only 2%-4% of addicts have 
normal personalities. It is not surprising either 
to find such a relatively large number of “ab- 
normal” profiles in the control group, since most 
of these subjects were school dropouts of a 
relatively low socioeconomic level. However, it 
is quite evident that, as a whole, the controls do 
not show the pathology that is common among 
the addicts. 

Comparing the personalities of the two groups, 
test results indicate that individuals in both 
groups recognize problems within themselves, but 
that the addicts may be somewhat more willing 
to admit to socially undesirable characteristics 
than the nonaddicted. Some individuals in both 
groups tend to have disturbed personalities, but 
there seems to be deeper and more widespread 
pathology among addicts than among nonaddicts. 
Both groups tend to be overly sensitive and 
inclined to act out in the face of difficulties or if 
subjected to too much pressure, but these traits 
seem to be more prevalent among addicts than 
among nonaddicts. 

The most outstanding characteristics of the 
addict seem to be his psychopathic traits. He appears 
to be the kind of irresponsible, undependable, 
egocentric individual who has a disregard of 
social mores, acts on impulse, and demands immediate 
gratification of his wants. He is impatient 
and irritable, lacks the persistence to achieve a 
goal, and he will act out aggressively against 
authority or others who thwart his desires. This 
acting out may then be followed by feelings of 
guilt and depression which can only be alleviated 
by more drugs. He tends to be hypersensitive, 
tense, apprehensive, insecure, and self-conscious, 
and he has trouble forming warm and lasting 
emotional relationships. He becomes depressed 
readily, lacks confidence in himself, has poor 
morale, and finds it difficult to achieve a normal 
optimism with regard to the future; thus, the use 
of drugs may seem to him to be the only realistic 
solution of his problems—at least, it offers him a 
temporary relief from the pain of living. These 
findings, in general, are in agreement with those 
of other investigators. 




Briefly, most addicts seem to be suffering from 
a basic character disorder, although many also 
have associated psychoneurotic or psychotic traits. 
LONGITUDINAL STUDY OF SOCIAL CLASS AND 
DEFENSE PREFERENCES 


ALLAN R. WEINSTOCK 
Medical School, University of Wisconsin 


Longitudinal data were used to assess the relationship between childhood 
social class and the development of particular defense mechanisms in adult- 
hood. Childhood social class was correlated with ratings of defense mechanisms 
made when Ss were 30 yr. The results show that denial is negatively 
correlated, while projection and intellectualization are positively correlated, 
with childhood social class. 


Miller and Swanson (1960) have suggested that 
the development of particular defense mecha- 
nisms is related to social class. They reason that 
the harsh physical punishment and minimal re- 
wards experienced by children of working class 
parents will lead them to use fantasies and more 
gross distortions of reality to defend against 
external threats, and they suggest that denial is a 
characteristic defense of children from lower class 
families. On the other hand, the use of symbolic 
rewards and punishments in middle class families 
would lead the child to use more socially adaptive 
defenses such as intellectualization, projection, 
and displacement. 

Miller and Swanson found little direct relation- 
ship between social class and the use of particu- 
lar defenses. However, this absence of clear rela- 
tionships between social class and defenses in 
their research may be attributed to inappropriate 
methods for testing their hypotheses. Miller and 
Swanson used projective measures rather than 
actual behavior to measure defenses; they applied 
their tests in adolescence, when defenses are not 
yet stabilized; they rated social class at adoles- 
cence rather than during the subject’s childhood, 
when the influence of child training practices is 
the greatest. The present investigation is an attempt
to overcome the methodological weaknesses
in the Miller and Swanson study by employing 
longitudinal data on childhood social class and 
rating defenses from descriptions of actual behav- 
ior in adulthood. 


METHOD 
Sample 


The Ss in this investigation were participants in 
the University of California guidance study. The 
guidance study Ss were a subsample of 248 families 
drawn from the Berkeley Survey, which included 
every third child born in Berkeley between January 
1, 1928 and June 30, 1929. This guidance study 
sample was divided into two matched groups, the 
guidance group and the control group, and followed 
longitudinally to 30 years of age. The guidance group 


originally consisted of 65 males, 61 females, and 
their families. Of the male Ss, 39 returned for ex- 
tensive interviews when they were 30 years old. 
These 39 male Ss comprise the sample used in the 
present study. 


Assessment of Social Class 


The socioeconomic status of the family as of 1928- 
1929 (within a month of the S’s birth) was computed 
by the guidance study staff using the Warner Scale 
(Macfarlane, 1938; Warner, Meeker, & Eells, 1949).
Occupation, source of income, type of housing, and 
neighborhood were each evaluated on a seven-point 
scale. These scales were then added together with 
weightings of 4, 3, 3, and 2, respectively, for each 
component. 
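
The weighting scheme just described amounts to a weighted sum of four seven-point ratings. A minimal sketch in Python, assuming each component has already been rated from 1 to 7; the function name, the dictionary keys, and the example ratings are illustrative and are not taken from the guidance study records:

```python
# Weights for occupation, source of income, type of housing, and
# neighborhood, in the order given in the text.
WEIGHTS = {"occupation": 4, "income_source": 3, "housing": 3, "neighborhood": 2}


def warner_score(ratings):
    """Return the weighted sum of four seven-point component ratings.

    `ratings` maps each component name in WEIGHTS to an integer 1..7.
    The key names are hypothetical labels chosen for this sketch.
    """
    for name in WEIGHTS:
        value = ratings[name]
        if not 1 <= value <= 7:
            raise ValueError(f"{name} rating must be between 1 and 7, got {value}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Example with made-up ratings for one family.
example = {"occupation": 2, "income_source": 3, "housing": 4, "neighborhood": 5}
print(warner_score(example))  # 4*2 + 3*3 + 3*4 + 2*5 = 39
```

With these weights the total can range from 12 (every component rated 1) to 84 (every component rated 7).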

A previous study indicated that this scale is highly 
correlated with several measures of social status used 
with the guidance sample. On the present sample of 
39 males, the Warner scale had a correlation of .85 
with the California Institute of Child Welfare Index, 
which was applied at S’s birth (Atherton, 1962). 


Assessment of Defense Mechanisms 


A series of intensive office interviews was con- 
ducted with the Ss when they were approximately 
30 years old. Each S was asked to discuss the high 
and low points of his life, to describe his parents and 
compare them to each other and to himself, and to 
repeat this procedure with his wife’s family. The 
interviews were structured to provide information on 
personality, physical characteristics, reactions to 
stress, interests, and social interaction. Detailed notes 
were taken during the interviews and were later 
transcribed and typed. 

The interview material was read by two clinical 
psychologists. The raters, working independently, 
read through the entire case material on an S and 
rated him for his use of each of 10 defense mecha- 
nisms. Each mechanism was rated on an eight-point 
continuous scale. A rating of 8 on a mechanism 
indicated that it was a prominent feature of the 
S’s character structure; a moderate rating indicated 
slight or inconsistent use of the mechanism; a rat- 
ing of 1 indicated its absence in the interview 
material. Sample definitions of two of the defense 
mechanisms rated are given below. 
TABLE 1

RELIABILITIES OF DEFENSE MECHANISMS

Defense mechanism         Reliability
Isolation                     .61
Intellectualization           .80
Rationalization a             .43
Doubt and indecision          .82
Denial                        .65
Projection                    .63
Regression                    .75
Displacement                  .53
Reaction formation a          .37
Repression                    «17

Note. Reliabilities were corrected for attenuation by the
Spearman-Brown formula.
a Unreliable.


TABLE 2

CORRELATIONS BETWEEN THE WARNER SCALE OF
SOCIAL STATUS AND EGO MECHANISMS

Defense mechanism             r
Isolation                     .14
Intellectualization           .36*
Doubt and indecision          AS
Denial                       -.35*
Projection                    .36*
Regression                    .09
Displacement                  .13
Repression                    .01

Note. Correlations have been inverted for easier comprehension;
a positive correlation indicates that the use of a
mechanism is positively related to social status. N = 39.
*p < .05.


Denial: Denial of facts and feelings that would 
be painful to acknowledge. Basic formula—there is 
no pain, no anticipation of pain, no danger, no con- 
flict. As applied to the past, the formula is that it 
did not happen that painfully at all. Pollyanna 
attitude. In extreme cases, has a magical, oblivious
quality. 

Projection: A process by which an objectionable 
internal tendency is unrealistically attributed to 
another person or persons in the environment in- 
stead of being recognized as part of one’s self. The 
objectionable tendency that is projected may be 
either an id impulse and any of its derivatives or a 
superego attitude and any of its derivatives. Mani- 
festations—suspicious about what the Institute knows 
about them and about the intentions of the Insti- 
tute; stereotypes about Negro hostility and sexuality; 
feeling of being victimized by one’s boss or fellow- 
workers; perception of the world as a jungle and the 
feeling that one needs to be constantly on guard. 
Readiness to see in personal interactions (either his 
own or those of others) the possibility of being 
made a sucker. These concerns may be used to justify
retaliatory measures.

This rating system was developed by Haan (1963) 
and Kroeber (1964) and its validity established in 
relation to projective techniques, the MMPI and the 
CPI, and a number of socioeconomic and intellectual 
measures (Haan, 1963, 1964a, 1964b, 1965). The 
reliabilities of the two raters for the 39 cases used 
in the present study are reported in Table 1. All 
interrater reliabilities which did not approach a 
significance level of .05 were defined as unreliable. 
Ratings of rationalization and reaction formation 
did not meet this criterion. 
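
The note to Table 1 states that the reliabilities were corrected with the Spearman-Brown formula. As a rough illustration, assuming the usual form of that formula for a composite of k raters, r_kk = k r / (1 + (k - 1) r), and assuming k = 2 for the two raters used here. The report does not give the uncorrected correlations, so the input value below is made up:

```python
def spearman_brown(r, k=2):
    """Estimate the reliability of a composite of k raters.

    r is the correlation between two single raters; k is the number of
    raters whose scores are averaged.
    """
    return k * r / (1 + (k - 1) * r)


# Hypothetical single-rater correlation, not a value from the study.
print(round(spearman_brown(0.50), 2))  # 0.67
```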


Analysis 


The scores assigned by each of the two raters were 
composited into mean ratings for each S on each
defense mechanism. These mean ratings on each of 
the eight reliably rated defense mechanisms were 
correlated with the ratings of social class. 
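
The analysis described here has two steps: average the two raters' scores for each S, then correlate the averages with social class. A minimal sketch under stated assumptions: the data are invented, the function names are illustrative, and the significance check uses the standard t transformation of r, which the report does not name explicitly:

```python
import math


def mean_ratings(rater_a, rater_b):
    """Average two raters' scores subject by subject."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing whether r differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Made-up data for five subjects, only to show the calls.
social_class = [20, 35, 41, 56, 70]
composite = mean_ratings([1, 2, 2, 3, 4], [1, 3, 2, 4, 4])
print(round(pearson_r(social_class, composite), 2))

# With N = 39 as in the study, an r of .36 corresponds to this t on 37 df.
print(round(t_for_r(0.36, 39), 2))  # 2.35
```

The two-tailed .05 critical value of t on 37 degrees of freedom is about 2.03, so correlations of the size starred in Table 2 clear that threshold with N = 39.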


RESULTS AND DISCUSSION


The correlations between social class and the 
use of each defense mechanism are presented in 
Table 2. Three of the eight correlations are sig- 
nificant at p < .05 (two-tailed test). As hypothe- 
sized by Miller and Swanson (1960), denial is 
negatively correlated, while intellectualization 
and projection are positively correlated, with so- 
cial class. These findings lend support to the 
Miller and Swanson (1960) hypothesis that chil- 
dren of relatively lower class families tend to 
employ a more primitive, global, reality-distort- 
ing means of defending against conflict, while 
higher social status is associated with the child’s 
use of more differentiated and socially adaptive 
defenses requiring greater cognitive skills. Defenses
which seem to fall between the extremes
of this continuum are not correlated with social 
class. Miller and Swanson (1960) have linked 
the acquisition of these mechanisms to the re- 
wards and punishments employed by different 
social classes. While the present study does not 
deal with the means by which these mechanisms 
are passed on, a second research report (Wein- 
stock, 1967) suggests that parental modeling 
rather than reward and punishment plays the 
central role in their acquisition. 

BRIEF REPORT


VALIDATION ATTEMPT OF HOVEY’S FIVE-ITEM MMPI INDEX 
FOR CNS DISORDERS 1


LAWRENCE R. MAIER 
Lakeland Mental Health Center, 
Fergus Falls, Minnesota 


In this study cross-validation of a five-item 
MMPI Central Nervous System (CNS) scale 
constructed by Hovey (1964) was attempted. 
Hovey, by using the CNS scale score and the K 
raw score, established two criterion levels: I. 
With a scale score of 4 or 5 and a raw K of at 
least 8, the chances of a person having a brain- 
damage diagnosis would be 86 out of 100; II, 
with a scale score of 4 or 5 and a raw K of at 
least 13, the chances would be 92 out of 100. 
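
Hovey's two criterion levels can be read as a simple decision rule. A sketch with a hypothetical function name; the thresholds are those quoted above, while the return coding (0, 1, 2) is an assumption of this example and not part of Hovey's scale:

```python
def hovey_criterion_level(cns_score, raw_k):
    """Return the highest criterion level met: 2, 1, or 0 for neither.

    Level I:  CNS scale score of 4 or 5 and raw K of at least 8.
    Level II: CNS scale score of 4 or 5 and raw K of at least 13.
    """
    if cns_score not in (4, 5):
        return 0
    if raw_k >= 13:
        return 2
    if raw_k >= 8:
        return 1
    return 0


print(hovey_criterion_level(4, 9))   # 1
print(hovey_criterion_level(5, 15))  # 2
print(hovey_criterion_level(3, 20))  # 0
```

Because the K cutoff for Level II is higher than for Level I, every profile that meets Level II also meets Level I.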

The Ss consisted of 155 patients undergoing 
diagnostic evaluation at Wilford Hall United 
States Air Force Hospital. In each case the ques- 
tion of possible cortical dysfunction had been 
raised. The MMPI, medical charts, and the medical
criteria for the final diagnostic opinions were
examined; following this, the sample was divided 
into organic (n = 73) and nonorganic (n = 82)
groups. Mean ages were 29.8 and 31.2 years,
respectively.

The following hypotheses were made: (1) The 
number of Ss in each group who meet Criterion 
Levels I and II is not independent of the diagnostic
group; (2) the observed frequency of organic
diagnoses in individuals who meet Criterion
Levels I and II is no different than the expected 
frequencies reported by Hovey; (3) the CNS 
scale leads to fewer diagnostic errors in the eval- 
uation of the presence of cerebral pathology than 
does diagnosis based solely on base rates.

Chi-square analyses were run comparing the 
number of Ss in each group who met Criterion 


1 An extended report of this study may be obtained
without charge from Lawrence Maier, Lakeland
Mental Health Center, Fergus Falls, Minnesota
56537, or for a fee from the American Documentation
Institute. Order Document No. 9573 from ADI Auxiliary
Publications Project, Photoduplication Service,
Library of Congress, Washington, D. C. 20540. Remit
in advance $1.75 for microfilm or $2.50 for
photocopies, and make checks payable to: Chief, 
Photoduplication Service, Library of Congress. This 
research was conducted while both authors were at 
Wilford Hall United States Air Force Hospital, San 
Antonio, Texas. 
Levels I and II. No significant differences were
found. Hovey’s expected frequencies of correct 
organic diagnosis for Ss at Criterion Levels I and 
II were compared with the observed frequencies 
by use of chi-square. For Criterion Level I a 
significant difference was found, with the incidence 
of organicity being much less than the expected 
values (χ² = 4.92, p < .05). For Criterion Level
II a difference significant only at the .10 level
was found (χ² = 2.76). Comparison of the CNS
scale diagnoses to the base rates (47% organic 
and 53% nonorganic) revealed 80 correct CNS 
scale diagnoses out of 155 decisions versus 73 
correct diagnoses using only the organic base 
rates. In addition, the CNS scale correctly diag- 
nosed about 21% of the organically labeled pa- 
tients, whereas the base rates yielded 47% cor- 
rect diagnoses. Hence, all three hypotheses were 
unconfirmed. 
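
The base-rate comparison above is simple arithmetic on the counts given in the text. A small sketch; the variable names are illustrative, and the nonorganic count is derived here as 155 minus 73 (consistent with the stated 53% base rate) rather than quoted from the report:

```python
total = 155
organic = 73
nonorganic = total - organic  # derived, consistent with the 53% base rate

cns_correct = 80             # correct CNS scale diagnoses, from the text
call_all_organic = organic   # correct decisions if every patient is called organic


def hit_rate(correct, n=total):
    """Proportion of correct decisions out of n."""
    return correct / n


print(f"organic base rate: {hit_rate(organic):.0%}")              # 47%
print(f"nonorganic base rate: {hit_rate(nonorganic):.0%}")        # 53%
print(f"CNS scale hit rate: {hit_rate(cns_correct):.1%}")         # 51.6%
print(f"all-organic hit rate: {hit_rate(call_all_organic):.1%}")  # 47.1%
```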

The composite MMPI profiles of the organic 
and nonorganic groups were essentially the same. 
In contrast, the composite profiles of those 
patients with significant CNS scale scores and 
those without were strikingly dissimilar. Statistical
t-test analyses between raw-score means of
the two groups yielded differences on the Hs, D, 
Hy, Pt, and Sc scales at the .01 level and on the 
Si scale at the .05 level. In all instances the mean 
for the experimental group was higher. 

These data suggest that Hovey’s scale is not 
capable of effectively discriminating between pa- 
tients with and without organic diagnoses. The 
similarity between the organic-nonorganic group 
profiles and the dissimilarity between the signifi- 
cant-nonsignificant CNS scale group profiles sug- 
gests that CNS scores may be more related to 
general psychological disruption, rather than to 
cerebral pathology per se. 

This study attempted to examine whether chronic schizophrenic patients could
effectively engage in the manipulative strategy of impression management in
an evaluative interview situation. The data supported the expectation that
schizophrenic mental patients can effectively present themselves as “sick” or
“healthy,” whichever is more suited to their needs and goals. Thus, when the
patients’ open ward status was questioned, they convincingly presented themselves
in the interview as “healthy” and eligible for open ward living; when
their residency status was questioned, they convincingly presented themselves
as “sick” and ineligible for discharge. These findings were interpreted as
supporting assumptions of patient effectiveness in implementing goals.


The present investigation is concerned 
with the manipulative behavior of hospitalized 
schizophrenics in evaluative interview situa- 
tions. More specifically, the study attempts 
to answer the question: Can schizophrenic 
patients effectively control the impressions 
(impression management, Goffman, 1959) 
they make on the professional hospital staff? 

Typically, the mental patient has been 
viewed as an extremely ineffectual and help- 
less individual (e.g., Arieti, 1959; Becker, 
1964; Bellak, 1958; Joint Commission on
Mental Illness and Health, 1961; Redlich &
Freedman, 1966; Schooler & Parkel, 1966;
Searles, 1965). For example, Redlich and 
Freedman (1966) described the mental patient
and his pathological status in the following
manner: “There is a concomitant loss
of focus and coherence and a profound shift 
in the meaning and value of social relation- 
ships and goal directed behavior. This is 
evident in the inability realistically to imple- 
ment future goals and present satisfactions; 
they are achieved magically or through fan- 
tasy and delusion. . . [p. 463].” Schooler 
and Parkel (1966) similarly underline the 
mental patients’ ineffectual status in this 
description: “the chronic schizophrenic is not 


1 The authors would like to express their appreciation
to Doris Seiler and Dennis Ridley for assisting
with the data collection.


Seneca’s ‘reasoning animal,’ or Spinoza’s ‘so- 
cial animal,’ or even a reasonably efficient 
version of Cassirer’s ‘symbol using animal’. 
. . . Since he violates so many functional 
definitions of man, there is heuristic value 
in studying him with an approach like that 
which would be used to study an alien crea- 
ture [p. 67].” 

Thus, the most commonly held assumptions 
concerning the nature of the schizophrenic pa- 
tient stress their ineffectuality and impotency. 
In this context one would expect schizo- 
phrenics to perform less than adequately in 
interpersonal situations, to be unable to initiate
manipulative tactics, and, certainly, to be
incapable of successful manipulation of other
people.2

In contrast to the above view of the schizo- 
phrenic, a less popular orientation has been 
expressed by Artiss (1959), Braginsky, Grosse, 
and Ring (1966), Goffman (1961), Levinson 
and Gallagher (1964), Rakusin and Fierman 
(1963), Szasz (1961, 1965), and Towbin 
(1966). Here schizophrenics are portrayed in 
terms usually reserved for neurotics and nor- 

2 This statement is explicitly derived from formal 
theories of schizophrenia and not from clinical ob- 
servations. It is obvious to some observers, how- 
ever, that schizophrenics do attempt to manipulate 
others. The discrepancy between these observations 
and traditional theoretical assumptions about the 
nature of schizophrenics is rarely, if ever, reconciled. 
mal persons. Simply, the above authors subscribe
to the beliefs that: (a) the typical
schizophrenic patient, as compared to normals,
is not deficient, defective, or dissimilar in
intrapsychic functioning; (b) the typical 
schizophrenic patient is not a victim of his 
illness; that is, it is assumed that he is 
not helpless and unable to control his behavior 
or significantly determine life outcomes; (c) 
the differences that some schizophrenic pa- 
tients manifest (as compared to normals) are 
assumed to be more accurately understood in 
terms of differences in belief systems, goals, 
hierarchy of needs, and interpersonal strate- 
gies, rather than in terms of illness, helpless- 
ness, and deficient intrapsychic functioning. 
This orientation leads to the expectation that 
schizophrenic patients do try to achieve 
particular goals and, in the process, effectively 
manipulate other people. 

There is some evidence in support of this 
viewpoint (e.g., Artiss, 1959; Braginsky, 
Holzberg, Finison, & Ring, 1967; Levinson 
& Gallagher, 1964). Furthermore, a recent
study (Braginsky et al., 1966) demonstrated 
that schizophrenic patients responded, on a 
paper-and-pencil “mental status” test, in a 
manner that would protect their self-interests. 
Those who wanted to remain in the hospital 
(chronic patients) presented themselves as 
“sick,” whereas those who desired to be discharged
(first admissions) presented themselves
as “healthy.” That is, they effectively
controlled the impressions they wished to 
make on others. Their manipulative perform- 
ance, however, was mediated by an impersonal 
test. 

Therefore, the following question is asked: 
Can schizophrenics engage in similar manip- 
ulative behaviors in a “face-to-face” interview 
with a psychologist? That is, will chronic 
schizophrenics who desire to remain in the 
hospital and live on open wards present them- 
selves in an interview situation when they 
perceive that their open ward status is being 
questioned as (a) “healthy” and, therefore, 
eligible for open ward living, and in another 
interview situation when their residential 
status is being questioned as (b) “sick” and, 
therefore, ineligible for discharge? If so, are 
their performances convincing to a professional
audience (i.e., psychiatrists)?


BENJAMIN M. BRAGINSKY AND DOROTHEA D. BRAGINSKY


METHOD 


A sample of 30 long-term (more than 2 con- 
tinuous years of hospitalization) male schizo- 
phrenics living on open wards was randomly se- 
lected from ward rosters. Two days prior to the 
experiment the patients were told that they were 
scheduled for an interview with a staff psychologist. 
Although each patient was to be interviewed indi- 
vidually, all 30 were brought simultaneously to a 
waiting room. Each patient interviewed was not 
allowed to return to this room, to insure that 
patients who had participated would not communi- 
cate with those who had not. 

Each patient was escorted to the interview room 
by an assistant, who casually informed the patient 
in a tone of confidentiality about the purpose of the 
interview (preinterview induction). Patients were 
randomly assigned by the assistant to one of three 
induction conditions (10 to each condition). The
interviewer was unaware of the induction to which 
the patients were assigned, thereby eliminating inter- 
viewer bias. 


Induction Conditions 


Discharge induction. Patients were told: “I think
the person you are going to see is interested in exam- 
ining patients to see whether they might be ready 
for discharge.” 

Open ward induction.3 Patients were told: “I
think that the person you are going to see is inter- 
ested in examining patients to see whether they 
should be on open or closed wards.” 

Mental status induction.4 Patients were told: “I
think the person you are going to see is interested
in how you are feeling and getting along in the hospital.”

After greeting each patient the interviewer asked: 
“How are you feeling?” Patients who responded 
only with physical descriptions were also asked:
“How do you feel mentally?” whereas those who
only gave descriptions of their mental state were 
asked: “How are you feeling physically?” The 
patients’ responses were tape-recorded. The interview
was terminated after 2 minutes,5 whereupon
the purpose of the experiment was disclosed. 


3 It may be suggested that the open ward induction
was meaningless, since no patient enjoying open
ward status would believe that he could be put on 
a closed ward on the basis of an interview. At the 
time this experiment was being conducted, how- 
ever, this hospital was in the process of reorganiza- 
tion, and open and closed ward status was a salient 
and relevant issue. 

4 Mental status evaluation interviews are typically
conducted yearly. Thus, patients who have been in 
the hospital for more than a year expect to be inter- 
viewed for the purposes of determining their resi- 
dency status. 

5 Although, admittedly, psychiatrists would never 
base decisions concerning mental status and dis- 
charge on a 2-minute interview, it was adequate 
Three staff psychiatrists from the same hospital
separately rated each of the 30 tape-recorded inter- 
views during two 40-minute sessions. The psychia- 
trists had no knowledge of the experiment, and 
they were unfamiliar with the patients; they 
were told by the experimenter that these were 
mental patients residing in the hospital and that as 
a group they represented a wide range of diagnostic 
categories. 

The psychiatrists rated the patients on the fol- 
lowing dimensions: (a) the patient’s degree of psy- 
chopathology, using a five-point scale ranging from 
“not at all ill” to “extremely ill”; (b) the amount 
of hospital control a patient needed, ranging on an 
eight-point scale from complete freedom (“dis- 
charge”) to maximum control (“closed ward, con- 
tinual observation”); and (c) the structural or 
qualitative aspects of the patient’s speech, such as 
pressure of speech, affect, volume, etc. The score 
for each patient’s speech characteristic was based on 
the sum of the psychiatrist’s rating of 14 Lorr scale 
items (Lorr, 1953). Each item was rated on an
eight-point scale ranging from not at all atypical to 
extremely atypical verbal behavior. 


Predictions 


If long-term patients are both motivated to 
live on open wards and to remain in the hospital 
and if, in addition, they effectively engage in impres- 
sion management in order to realize these desires, 
then the following would be expected: 

1. Psychiatrists will rate patients in the discharge 
and the mental status conditions as being similar 
with respect to psychopathology and need for hos- 
pital control. Mental status interviews are generally 
used by the hospital to evaluate patients for dis- 
charge; therefore, the mental status and discharge 
conditions offer the same potential consequences 
for patients. Thus, patients in both conditions will
give the impression of being “sick” and in need 
of hospital control in order to decrease the prob- 
ability of discharge. The purpose of including the 
discharge induction was to present the consequences 
of the interview as explicitly as in the open ward 
induction.

2. Psychiatrists will rate the patients in the open 
ward condition significantly less mentally ill and less 
in need of hospital control than patients in the dis- 
charge and mental status conditions. That is, pa- 
tients in the open ward condition will give the 
impression of being “healthy” in order to maximize 
their chances of remaining on an open ward. 


for the purposes of this study (namely, to determine
if mental patients effectively engage in impression
management). The 2-minute response to the
single question provided sufficient information for
psychiatrists to form reliable impressions of the
patients. Interestingly, the typical mental status
interview conducted by these psychiatrists is rarely 
longer than 30 minutes. 
Subjects


The mean age of the patients was 47.4 years 
(SD = 8.36). The mean educational level of the
group was 8.05 years of schooling (SD = 3.44). The
median length of hospitalization was 10 years. In 
terms of diagnostic categories, 43% of the sample 
was diagnosed as chronic undifferentiated schizo- 
phrenic, 37% as paranoid schizophrenic, 10% as 
catatonic, and the remaining 10% as simple schizo- 
phrenic. There were no differences between the 
three experimental groups on any of the above 
variables. 


RESULTS AND DISCUSSION


The reliability coefficients of the three 
psychiatrists’ combined ratings of the patient 
interviews were as follows: (a) ratings of 
psychopathology, r = .89, p < .01; (b) need
for hospital control, r = .74, p < .01; (c)
normality of speech characteristics, r = .65,
p < .01. Thus, it was concluded that there
was significant agreement between the three 
psychiatrists. 

The means of the psychopathology ratings 
by experimental condition are presented in 
Table 1. The ratings ranged 1-5. The analysis 
of variance of the data yielded a significant 
condition effect (F = 9.38, p < .01). The difference
between the open ward and discharge
conditions was statistically significant (p < .01;
Tukey multiple-range test). In addition,
the difference between the open ward and the
mental status condition was significant (p < .01).
As predicted, there was no significant
difference between the discharge and mental 
status conditions. 

The means of the ratings of need for hos- 
pital control are presented in Table 1. These 
ratings ranged 1-8. The analysis of these 
data indicated a significant difference between
the means (F = 3.85, p < .05). Again,
significant differences (beyond the .05 level)
were obtained between the open ward
and the discharge conditions, as well as between
the open ward and mental status
conditions. No difference was found between
the discharge and mental status conditions.


TABLE 1

MEAN PSYCHOPATHOLOGY AND NEED-FOR-HOSPITAL-CONTROL
RATINGS BY EXPERIMENTAL CONDITION

                             Open ward     Mental status     Discharge
Rating                        M     SD       M     SD         M     SD
Psychopathology              2.63   .58     3.66   .65       3.70   .67
Need for hospital control    2.83  1.15     4.10  1.31       4.20  1.42
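
The analyses of variance reported here can be approximated from the means and SDs in Table 1 alone. A sketch under stated assumptions: 10 patients per condition (as in the Method), SDs computed with an n - 1 denominator, and a function name invented for this example. Because the table values are rounded and the report does not describe the computation in detail, the sketch yields F ratios of roughly 9.1 and 3.5, near but not equal to the reported 9.38 and 3.85:

```python
def anova_f_from_summary(means, sds, n_per_group):
    """One-way ANOVA F ratio from group means and SDs with equal group sizes.

    Assumes the SDs are sample standard deviations (n - 1 denominator).
    Returns (F, df_between, df_within).
    """
    k = len(means)
    grand_mean = sum(means) / k
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    ms_between = ss_between / (k - 1)
    ms_within = sum(sd ** 2 for sd in sds) / k  # pooled variance, equal n
    return ms_between / ms_within, k - 1, k * (n_per_group - 1)


# Table 1 values in the order open ward, mental status, discharge.
psychopathology = anova_f_from_summary([2.63, 3.66, 3.70], [0.58, 0.65, 0.67], 10)
control = anova_f_from_summary([2.83, 4.10, 4.20], [1.15, 1.31, 1.42], 10)
print(psychopathology)
print(control)
```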

On the basis of these analyses it is clear 
that patients in the open ward condition ap- 
pear significantly less mentally ill and in less 
need of hospital control than patients in 
either the discharge or mental status condi- 
tions. Obviously the patients in these con- 
ditions convey different impressions in the 
interview situation. In order to ascertain the 
manner by which the patients conveyed these 
different impressions, the following three 
manipulative tactics were examined: (a) 
number of positive statements patients made 
about themselves, (b) number of negative 
statements made about themselves (these in- 
clude both physical and mental referents), and 
(c) normality of speech characteristics (i.e., 
how “sick” they sounded, independent of the 
content of speech). The first two indexes were 
obtained by counting the number of positive 
or negative self-referent statements a patient 
made during the interview. These counts were 
done by three judges independently, and the 
reliability coefficient was .95. The third index 
was based on the psychiatrists’ ratings on 14 
Lorr scale items of the speech characteristics 
of patients. A score was obtained for each
patient by summing the ratings for the 14
scales.

Ratings of psychopathology and need for
hospital control were, in part, determined by
the frequency of positive and negative self-referent
statements. The greater the frequency
of positive statements made by a
patient, the less ill he was perceived (r = -.58,
p < .01) and the less in need of hospital
control (r = -.41, p < .05). Conversely,
the greater the frequency of negative statements,
the more ill a patient was perceived
(r = .53, p < .01) and the more in need of
hospital control (r = .37, p < .05). It is noteworthy
that patients were consistent in
their performances; that is, those who tended
to say positive things about themselves
tended not to say negative things (r = -.55,
p < .01).
When self-referent statements were compared
by condition, it was found that patients
in the open ward condition presented them- 
selves in a significantly more positive fashion 
than patients in the discharge and mental 
status conditions. Only 2 patients in the 
open ward condition reported having physical 
or mental problems, whereas 13 patients in 
the mental status and discharge conditions 
presented such complaints (χ² = 5.40,
p < .05).

The frequency of positive and negative self- 
referent statements, however, cannot account 
for important qualitative components of the 
impressions the patients attempted to convey. 
For example, a patient may give only one 
complaint, but it may be serious (e.g., he re- 
ports hallucinations), whereas another patient 
may state five complaints, all of which are 
relatively benign. In order to examine the 
severity of symptoms or complaints reported 
by patients, the number of “psychotic” com- 
plaints, namely, reports of hallucinations or 
bizarre delusions, was tallied. None of the patients
in the open ward condition made reference
to having had hallucinations or delusions,
while nine patients in the discharge
and mental status conditions spontaneously
made such reference (χ² = 4.46, p < .05).
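
The two chi-square values reported for complaints can be checked from the counts in the text. A minimal sketch, assuming 10 patients in the open ward condition and 20 in the discharge and mental status conditions combined (as stated in the Method); the function name is illustrative. The report does not say whether a continuity correction was applied. The first reported value is matched without it and the second with it, which is an observation about the arithmetic and not a statement of what the authors did:

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic (1 df) for the 2 x 2 table [[a, b], [c, d]].

    With yates=True the continuity correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom


# Rows: open ward (10 patients) vs. discharge plus mental status (20 patients).
# Columns: made the complaint vs. did not.
print(round(chi_square_2x2(2, 8, 13, 7), 2))               # 5.4
print(round(chi_square_2x2(0, 10, 9, 11, yates=True), 2))  # 4.46
```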

In comparing the structural or qualitative 
aspects of patient speech no significant dif- 
ferences were obtained between experimental 
conditions. Patients “sounded” about the 
same in all three conditions. The majority of 
patients (80%) were rated as having rela- 
tively normal speech characteristics. Although 
there were no differences by condition, there 
was a significant inverse relationship (r = -.35,
p < .05) between quality of speech and
the number of positive statements made. 
That is, patients were consistent to the ex- 
tent that those who sounded ill tended not 
to make positive self-referent statements. 

In summary then, the hypotheses were con- 
firmed. It is clear that patients responded to 
the inductions in a manner which maximized 
the chances of fulfilling their needs and goals. 
When their self-interests were at stake pa- 
tients could present themselves in a face-to- 
face interaction as either “sick” or “healthy,” 
whichever was more appropriate to the situa- 
tion. In the context of this experiment “sick” 
impressions were conveyed when the patients
were faced with the possibility of discharge. 
On the other hand, impressions of “health” 
were conveyed when the patients’ open ward 
status was questioned. Moreover, the impres- 
sions they conveyed were convincing to an 
audience of experienced psychiatrists. 

One may argue, however, that the dif- 
ferences between the groups were a func- 
tion of differential anxiety generated by the 
inductions rather than a function of the pa- 
tients’ needs, goals, and manipulative strate- 
gies. More specifically, the discharge and the 
mental status conditions would generate more 
anxiety and, therefore, more pathological be- 
havior than the open ward condition. As a 
result, the psychiatrists rated the patients in 
the discharge and mental status conditions as 
“sicker” than patients in the open ward con- 
dition. According to this argument, then, the 
patients who were rated as sick were, in fact, 
more disturbed, and those rated healthy were, 
in fact, less disturbed. 

No differences, however, were found be- 
tween conditions in terms of the amount of 
disturbed behavior during the interview. As 
was previously mentioned, the psychiatrists 
did not perceive any differences by condition 
in atypicality of verbal behavior. On the con- 
trary, the patients were judged as sounding 
relatively normal. Thus, the psychiatrists’ 
judgments of psychopathology were based pri- 
marily on the symptoms patients reported 
rather than on symptoms manifested. Patients 
did not behave in a disturbed manner; rather, 
they told the interviewer how disturbed they 
were.

The traditional set of assumptions con- 
cerning schizophrenics, which stresses their 
irrationality and interpersonal ineffectuality, 
would not only preclude the predictions made 
in this study, but would fail to explain parsi- 
moniously the present observations. It is quite 
plausible and simple to view these findings 
in terms of the assumptions held about people 
in general; that is, schizophrenics, like normal 
persons, are goal-oriented and are able to
control the outcomes of their social encounters 
in a manner which satisfies their goals. 
If staff decisions regarding patient dispositions are to be maximally beneficial,
clear communication must occur among staff members. The degree of agree-
ment among staff members from 8 professional groups (psychologists, social
workers, psychiatrists, teachers, rehabilitation workers, nurses, and male and
female technicians) with respect to their patient perception was investigated.
48 professionals were asked to describe, using the Sonoma Check List, (1) a
male and a female patient ready for discharge to a family-care home, and (2)
a male and a female patient who would benefit from psychotherapy. A corre-
lational analysis of the 4 ratings indicated high agreement among the 8 groups
with respect to the 1st 2 ratings, but considerable disagreement on the 2nd 2
ratings. Both psychiatrists and psychologists seek therapy patients with
strengths and with pathology; technicians tend to emphasize behavioral
management; nurses are more concerned with affiliative tendencies.


The idea of teamwork among staff members 
of different disciplines has long been cher- 
ished. In practice, however, this concept has 
been difficult to implement, since various dis- 
ciplines may perceive the same situation from 
quite disparate viewpoints. Grayson and Tol- 
man (1950) have shown, for example, that 
the semantic concepts of clinical psychologists 
and psychiatrists may be quite different, al- 
though both groups use the same vocabulary. 
Shotwell, Dingman, and Tarjan (1960) found 
that professional personnel attach more im- 
portance to activities directly related to pa- 
tients, whereas psychiatric technicians rated 
as more important those activities not related 
to patients, for example, keeping busy at 
making the ward neat and clean. The same 
study pointed out the difficulties that ad- 
ministrators and professionals encounter in 
carrying out certain treatment philosophies 
when the supporting staff does not share the 
same perception. 

Two important decisions involving a team 
effort are those which concern hospital dis- 
charge of patients judged improved and as- 
signment of patients to more intensive treat- 
ment. This study represents an attempt to 
identify the varying patient perceptions that 
staff members of different disciplines may 
bring to such situations. 


METHOD 


Forty-eight staff members of a large state hospital
for the mentally retarded participated in this study
as raters. There were six of each of the following:
clinical psychologists, social workers, psychiatrists,
teachers, rehabilitation workers, registered nurses,
female technicians, and male technicians (i.e., ward
personnel). These raters were asked to describe (a)
a male patient ready for family-care placement
(M-Pl); (b) a female patient ready for family-care
placement (F-Pl); (c) a male patient who would
benefit from psychotherapy (M-Rx); and (d) a
female patient who would benefit from psycho-
therapy (F-Rx). Each rater worked independently;
the sequence of the four ratings was counterbalanced
among raters.

The Sonoma Check List (SCL) was used for the 
ratings. The SCL, a list of 210 adjectives, was spe- 
cifically constructed for rating mentally retarded 
patients. Previous studies (Domino, 1965; Domino, 
Goldschmid, & Kaplan, 1964; Goldschmid & Domino, 
1965) have described the construction of the SCL in 
more detail and have demonstrated its usefulness 
for personality ratings of retardates. 


RESULTS 
Intragroup Agreement 


All adjectives checked six, five, one, or zero 
times within each professional group were 
tabulated. These adjectives represent high 
agreement that a particular SCL item is or 
is not applicable. By comparing the total 
number of these adjectives group by group, 
one may obtain an estimate of whether any 
one group shows more or less agreement 
within itself than other groups. An analysis 
of variance indicated no significant differ- 
ences among the groups with respect to intra- 
group agreement. 
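
The tabulation just described can be illustrated with a brief sketch. It assumes, hypothetically, that each rater's checklist is stored as a set of checked item indices; the function name and the toy data are illustrative only and are not taken from the study.

```python
from collections import Counter


def high_agreement_count(checklists, n_items):
    """Count items checked by nearly all or nearly none of a group's raters.

    checklists: one set of checked item indices per rater (assumed layout).
    n_items: length of the checklist (210 for the SCL).
    """
    n_raters = len(checklists)
    counts = Counter()
    for checked in checklists:
        counts.update(checked)
    # With six raters this keeps items checked 6, 5, 1, or 0 times.
    return sum(
        1
        for item in range(n_items)
        if counts[item] >= n_raters - 1 or counts[item] <= 1
    )


# Toy group of six raters and a five-item list (hypothetical data).
group = [{0, 1}, {0, 1}, {0, 2}, {0, 1}, {0}, {0, 1, 3}]
print(high_agreement_count(group, n_items=5))
```

One such count per professional group and rating would then be the kind of score entered into an analysis of variance like the one reported above.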
DIFFERENTIAL PATIENT PERCEPTION


Intergroup Agreement 


A correlational analysis using the Pearson r 
was carried out comparing all ratings of each 
group with those of all other groups. 

For the M-Pl ratings the coefficients ranged
from +.69 to +.86, with a median r of +.81.
Similarly, for the F-Pl ratings the r’s ranged
from +.66 to +.86, with a median r of +.76,
both ratings indicating high agreement as to 
which personality characteristics a male or 
female candidate for family-care placement 
should possess. 

There was less agreement among the eight 
groups with respect to the M-Rx ratings, as 
indicated by a much wider range (r’s from
+.06 to +.75) and a lower median r of +.40. 
The greatest group differences existed be- 
tween the ratings by psychiatrists and female 
technicians (+.06) and between psychia- 
trists and teachers (+.09). Furthermore, the 
more professional groups (psychologists, social 
workers, and psychiatrists) tended to agree 
more with each other than with the other 
groups. It should be noted that psychiatrists 
showed the least agreement with the com- 
bined rating for all groups (+.40), whereas 
male technicians showed the highest agree- 
ment (+.85). Similar results were obtained 
for the F-Rx ratings. 
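
A minimal sketch of the intergroup comparison follows. It assumes that each group's rating is represented as a vector of check frequencies (0 to 6) over the SCL adjectives, which the text does not state explicitly; the group labels and numbers are made up for illustration.

```python
import math
from itertools import combinations
from statistics import median


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_r(profiles):
    """Correlate every pair of groups; profiles maps group -> frequency vector."""
    return {
        (g1, g2): pearson_r(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Hypothetical check frequencies for five adjectives in three groups.
profiles = {
    "psychiatrists": [6, 5, 0, 1, 3],
    "psychologists": [5, 6, 1, 0, 3],
    "technicians": [1, 0, 6, 5, 2],
}
rs = pairwise_r(profiles)
print({pair: round(r, 2) for pair, r in rs.items()})
print("median r:", round(median(rs.values()), 2))
```

With eight groups the same procedure yields 28 coefficients per rating, from which a range and a median of the kind reported above can be read off.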


Intragroup Concept Comparisons 


A third meaningful question is that of 
whether, within each professional group, the 
members view patients for family-care place- 
ment and patients for therapy in similar ways. 


TABLE 1

INTERCORRELATIONS BETWEEN M-Pl AND M-Rx
RATINGS AND F-Pl AND F-Rx RATINGS

Rater                    M-Pl vs. M-Rx    F-Pl vs. F-Rx

Psychiatrist             [illegible]      [illegible]
Psychologist             [illegible]      −.10
Registered nurse         .02              −.21**
Social worker            −.34*            .05
Female technician        −.44*            −.29**
Male technician          −.50*            −.40**
Teacher                  −.52**           [illegible]
Rehabilitation worker    −.50**           −.52**

Note. All n = 210.
* p < .05.
** p < .001.
Table 1 lists the correlation coefficients ob-
tained by comparing each of the eight groups’
ratings of M-Pl with M-Rx and F-Pl with
F-Rx. Marked group differences can be ob- 
served: Rehabilitation workers, for example, 
tended to describe these two patient types in 
opposite ways, while psychiatrists chose simi- 
lar SCL items for both. 


SCL Composites of Four Patient Types 


A composite personality description of each 
patient type was obtained by selecting those 
8 to 12 adjectives which had been checked 
four, five, or six times within a group—these 
frequencies representing substantial agree- 
ment as to the applicability of a specific SCL 
item. 

Ratings on family-care placement were com- 
bined for all eight groups, since, as indicated 
previously, these ratings were in substantial 
agreement. 

Adjectives checked most frequently for the 
male patient ready for family-care placement 
were: friendly, good-natured, likeable, pleas- 
ant, polite, reasonable, reliable, responsible, 
self-controlled, trainable, well-behaved, and 
well-oriented. The adjectives checked most 
frequently for the female patient were essen- 
tially the same, all reflecting socially desirable 
characteristics. 

Ratings on psychotherapy were not com- 
bined, since the eight groups exhibited varying 
degrees of disagreement. 

Psychiatrists checked the following SCL 
items most frequently for M-Rx: active, 
adaptable, aggressive, depressed, educable, 
imaginative, inhibited, interesting, intelligent, 
organized, responsive, and withdrawn. Female 
technicians, on the other hand, described the 
M-Rx as confused, depressed, disobedient, 
hostile, irritable, moody, quarrelsome, sulky, 
suspicious, stubborn, violent, and as exhibiting 
temper tantrums. These two sets of ratings 
represent the greatest amount of disagree- 
ment. 

Psychologists described the M-Rx as ag- 
gressive, anxious, cooperative, depressed, edu- 
cable, emotional, excitable, inhibited, and 
tense. Social workers rated the M-Rx as ag- 
gressive, anxious, confused, depressed, fear- 
ful, hostile, inhibited, nervous, odd, rebellious, 
and worrying. Registered nurses saw the M-Rx
as aggressive, defiant, destructive, hostile,
inhibited, secretive, show-off, and unsociable. 
Since there were no marked sex differences, 
the above adjectives apply equally well to 
F-Rx. 


DISCUSSION 


The high degree of agreement among the 
eight professional groups with respect to their 
ratings of a patient ready for family-care 
placement suggests that members of different 
disciplines view this situation in essentially 
the same terms. Since the referral process for
family-care placement may begin with any 
one professional person, but typically involves 
a team effort, it is reassuring to note the high 
intergroup agreement. 

The ratings for the patient who would bene- 
fit from psychotherapy, on the other hand, 
indicate that members of various disciplines 
do not view the therapeutic process in the 
same light. Psychiatrists, for example, seek 
traits representing therapeutic potential (in- 
telligent, interesting, active, adaptable) and 
characteristics reflecting pathology (de- 
pressed, inhibited, withdrawn). Ward techni- 
cians nominate patients with visible behavioral 
disturbances (confused, depressed, moody) 
that present management problems (disobedi- 
ent, has temper tantrums, violent). Psycholo- 
gists, like psychiatrists, look for relative 
strengths (educable, cooperative, responsive) 
as well as pathological trends (anxious, de- 
pressed, tense). Nurses tend to rate patients 
who could benefit from psychotherapy as aso- 
cial and difficult to reach (defiant, secretive, 
unsociable).

These differences in patient perception 
probably reflect each group’s training and 
specific concern with everyday, routine be- 
haviors (e.g., psychodynamics for psychia- 
trists, behavioral problems for technicians, 
affiliative tendencies for registered nurses). In 
this respect, our results seem to parallel those 
obtained by Shotwell et al. (1960) in that 
there are, in fact, differences in perceptions 
among professional groups and between pro-
fessional and supporting staffs. 

Differential perceptions of the therapeutic 
process may result not only in friction be- 
tween various disciplines but also in a lower- 
ing of each discipline’s effective functioning. 
If technicians and other supporting staff 
could gain a more realistic understanding of 
the potentials of psychotherapy by means of 
staff conferences, in-service training, or other 
devices, therapeutic referrals might become 
more appropriate. On the other hand, since 
the supporting staff is most concerned with 
patients who present management problems, 
therapeutic efforts might also be profitably 
applied to short-term behavioral changes. 

Finally, with respect to future research, 
brief mention might be made of the possible 
usefulness of the SCL for the practical prob- 
lem of selecting patients for family-care 
placement, psychotherapy, or other thera- 
peutic dispositions. For example, if all pa- 
tients placed in family care or referred for 
psychotherapy were rated on the SCL, it 
could be determined which patient types were 
more successfully placed and which resulted 
in therapeutic “success.” These ratings could 
then serve as criterion ratings for future pa- 
tient selections. 
USE OF A STRUCTURED AUTOBIOGRAPHY IN THE
CONSTRUCT VALIDATION OF PERSONALITY 
SCALES 1


ARTHUR H. HILL 2


University of Texas 


Structured autobiographies were written by 352 male freshmen. The products 
were content analyzed, and the responses of the high and low scorers 
were compared by chi-square on each of 6 CPI scales (Do, Re, So, Ac, 
Ai, Ie). Areas compared were relationships with father and mother, childhood, 
adolescence, current circumstances, self-description, and work-study habits. The 
results provide evidence for the validity of the manual’s description of Do, 
So, Ac, Ie, and for a general concept of independence on the Ai scale. The 
limited evidence on Re is equivocal. A structured autobiography is seen as a 
useful additional source of construct validation of personality scales. 


During the years that have elapsed since 
the American Psychological Association’s 
publication of its Technical Recommendations 
(1954), which outlined four types of validity, 
most attention has been paid to construct 
validity. The discussions of Cronbach and 
Meehl (1955), Loevinger (1957), Bechtoldt 
(1959), Jessor and Hammond (1957), Camp- 
bell and Fiske (1959), and Campbell (1960) 
have served to clarify and elaborate the con- 
cept, and Campbell’s (1960) paper gives a 
well-balanced presentation of its values, diffi- 
culties, and common misunderstandings.
Briefly, construct validation of a measuring 
instrument consists of demonstrating that ob- 
servable variations in postulated attributes or 
traits are reflected in test performance. A 
broader and more abstract behavioral descrip- 
tion than is available with other types of vali- 
dation is sought. For example, construct vali- 
dation of a personality scale entitled “Domi- 
nance” would include showing that high
scorers on the scale have had more experiences 
which theory suggests lead to the develop- 
ment of the trait of dominance and, further, 
that behavior defined as dominant is more 
frequently observed in these individuals. 
Autobiographical data throwing light on the 


1 Based on a paper presented at the meeting of
the American Psychological Association, September
1965.

2 Now at American Institutes for Research, Pitts-
burgh. The data on which this study is based were
collected by George G. Gonyea prior to his death in
January 1964. Coding, analysis, and interpretation
are the work of the present author.


conditions affecting the development of the 
trait under consideration contribute to the 
process of construct validation. The aim of 
the present study is to demonstrate the use- 
fulness of a structured autobiography in the 
construct validation of personality scales. 


PROCEDURE 


The Ss were 352 male freshmen at the University 
of Texas. Each S wrote an autobiography following 
a structured outline 3 such that information was ob-
tained about the following: his relationships with 
each of his parents and siblings, his childhood, ado- 
lescence, current circumstances, self-image, and work 
and study habits. A sample of the instructions to 
tap the individual’s perception of his father is as 
follows: 


Write a descriptive sentence or two about each of 
the following: His temperament. His attitude 
toward discipline and type of punishment used. 
His efficiency as a person. Type of relationships
with other people. Type of relationship with you. 
What influences on your life? 


To increase the probability of accurate self-report- 
ing, randomly ordered code numbers were assigned 
and it was stressed to the Ss that the material was 
confidential and would be used only by a research 
staff. Content analysis of the autobiographies in-
cluded ratings of positive or negative affect, cate- 
gorizing of such things as type of punishment used 
by each parent, and counting of the number of 
favorable or unfavorable adjectives used in certain 
kinds of descriptions. Table 1 lists the contents con- 
sidered. All analyses were performed by one gradu- 
ate research assistant.4 A random sample of 20 indi-
vidual autobiographies was recoded, and correlation 


3 Developed by Austin E. Grigg. 
4 Thanks are extended to Alan Appelbaum for his
painstaking work on this analysis. 
of the results of the original coding with the recod-
ing yielded a reliability coefficient of +.91. 

The Ss also completed an abridged form of the 
California Psychological Inventory 5 which included 
six scales—Dominance (Do), Responsibility (Re), 
Socialization (So), Achievement via Conformity 
(Ac), Achievement via Independence (Ai), and In- 
tellectual Efficiency (Ie)—all demonstrated to be
significantly related to academic achievement among 
college students (e.g., Gough, 1953; Gough, 1957;
Holland, 1959; Holland & Astin, 1962). For each of 
these six scales, groups were formed consisting of 
the 27% of individuals whose scores were highest 
and the 27% whose scores were lowest. These groups 
were compared by chi-square on each of the varia- 
bles resulting from the content analysis of the 
autobiography. Gough’s (1957) descriptions of the 
scales were used as the basis for deciding whether the 
information resulting from these analyses provided 
construct validation. 
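
The extreme-groups comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the handling of tied scores, the rounding of the 27% cut, the reduction of each variable to a 2 x 2 table, and the absence of a continuity correction are not specified in the text, and all names and numbers are hypothetical.

```python
def extreme_groups(scores, fraction=0.27):
    """Return (high_ids, low_ids): the top and bottom `fraction` of scorers.

    scores: dict mapping a subject id to a scale score (assumed layout).
    Ties at the cut points are broken by sort order, which is an assumption.
    """
    k = round(len(scores) * fraction)
    if k == 0:
        return [], []
    ranked = sorted(scores, key=scores.get)
    return ranked[-k:], ranked[:k]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical scale scores for 100 subjects.
scores = {f"S{i}": i for i in range(100)}
high, low = extreme_groups(scores)
print(len(high), len(low))

# Hypothetical 2 x 2 table: rows are high and low scorers, columns are
# presence and absence of a coded autobiography category.
chi2 = chi_square_2x2(10, 40, 25, 25)
print(round(chi2, 2), chi2 > 3.841)  # 3.841 is the .05 critical value for 1 df
```

Variables coded into more than two categories would call for a larger contingency table and correspondingly more degrees of freedom.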


RESULTS 


The results are consistent with the man- 
ual’s descriptions of four of the six variables 
measured. The construct of Ai received par- 
tial support from the data, and clear support 
appeared for a general concept of independ- 
ence. The evidence on Re is much more lim- 
ited, and it is not possible either to support 
or refute the concept on this basis. Table 1 
shows the items in each of seven areas which 
differentiate at beyond the .05 level of sig- 
nificance between high and low scorers.


Dominance 


The scale’s purpose, as described in the 
manual, is “to assess factors of leadership abil- 
ity, dominance, persistence, and social initia- 
tive.” High scorers tend to be seen as “ag- 
gressive, confident, persistent, and planful; as 
being persuasive and verbally fluent; as self- 
reliant and independent; and as having lead- 
ership potential and initiative.”

The single item concerning “father” which 
significantly differentiated the high Do group 
from the low was that the high scorers re- 
ported their fathers to be better educated 
(8% with less than high school graduation 
compared with 18% for the lows). Highs re- 
ported their mothers as being less often lenient 
in their attitudes toward discipline, as being 


5 Copyright 1956 by Consulting Psychologists
Press, Inc., Palo Alto, California. Appreciation is
expressed to H. G. Gough and to J. D. Black for
permission granted to George G. Gonyea to ab-
stract and reproduce CPI scales.


more efficient as persons, and as having a very 
positive influence on their sons’ lives. Con- 
cerning their childhoods, the high Do scorers 
more often reported having many playmates, 
being of high status among their friends, and 
used more favorable and less unfavorable ad- 
jectives to describe themselves as children.
Concerning adolescence, the highs more often 
said they got along well with other boys. 
There was a similar, though not statistically 
significant, trend in describing their relation- 
ships with girls during adolescence. Of the 
low Do scorers, 18% reported that they were 
unpopular as adolescents, compared with only 
1% of the highs; again, as adolescents, the 
highs perceived themselves as being of higher 
status among their friends. In describing their
current circumstances the high Do’s reported a 
greater amount of social life and more often 
said they were satisfied with their social lives.
When asked to describe themselves the highs 
used significantly more positive adjectives and 
less negative adjectives, a fact which was re- 
peated in describing their social patterns and 
work and study habits. Finally, when asked to 
list their major strengths—the characteristics 
which someone would value highly about them 
—the highs produced a significantly greater 
number. 

The picture that emerges of the high scorer 
on Do is of a fluent, popular, high-status, 
self-confident individual who generally can be 
categorized as aggressive. It is interesting to
note that these boys, who are dominant in a 
higher education setting, are the sons of 
better-educated fathers. Thus, there is good 
agreement between the attributes reported in 
the autobiographies of high scorers on the Do 
scale and those which would be predicted 
from the manual’s description. 


Responsibility 


The scale’s purpose, as described in the 
manual, is “to identify persons of conscien- 
tious, responsible, and dependable disposition 
and temperament.” High scorers tend to be 
seen as “planful, responsible, thorough, pro- 
gressive, capable, dignified, and independent; 
as being conscientious and dependable; re- 
sourceful and efficient; and as being alert to 
ethical and moral issues.” 

The only parent-related variable which dif-
ferentiated high and low Re scorers was that
highs reported their mothers as having been
a greater positive influence on their lives. The
things that currently troubled them most
were nonsocial in nature. In describing them-
selves highs used fewer negative adjectives,
and in characterizing their work and study
habits they used more positive adjectives.

On the Re measure high and low scorers
did not differ on enough of the autobiograph-
ical variables nor with sufficient consistency
to lend support to the construct for which
the scale is named. It may be that the ques-
tions asked were not sufficiently well directed
to enable the evidence to emerge, and the re-
sults may point more to a lack of positive
support than to any basis for rejection of the
scale as a measure of the construct.


TABLE 1

SUMMARY OF CPI SCALE COMPARISONS ON AUTOBIOGRAPHY VARIABLES

Variable  Do  Re  So  Ac  Ai  Ie

Father
  Education  More*  Less*  More*
  Temperament  Positive*  Positive***
  Attitude to discipline  Moderate*
  Type of punishment  Moderate*  Not privilege deprivation*
  Efficiency as a person  High*
  Relations with other people
  Relations with you  Closet  Close**
  Influence on your life  Positive**
Mother
  Education  Less*  Less*  More*
  Temperament  Positive***  Positive**
  Attitude to discipline  Strict*  Moderate*
  Type of punishment  Not “none”*  Not “none”*
  Efficiency as a person  High*  High*  Low**
  Relations with other people  Poor*
  Relations with you  Closet  Close***
  Influence on your life  Positive**  Positive**  Positivet**  Positive*  Positive*
Childhood
  Ordinal position  Firstborn**
  Relations with sisters  Good**  Good**
  Relations with brothers
  No. playmates  Many
  Peer status  High***
  Special problems  Less**
  Description of self as child:
    No. good adjectives  More
    No. bad adjectives  Less*  Less*  Less*  Less*
Adolescence
  General description  Happy**  Happy**
  Got along with boys  Well*  Well  pon
  Got along with girls  Littlest  Little*
  Amount of dating  ¢ pe
  Popularity  Hight  Lowe
  Peer status  High***  di
Current circumstances
  Social life, amount  More
  Satisfaction with social life  High
  Things enjoyed most  aid  Nonsocial***  Cultural’
  Things that trouble you most  Nonsocial  Nears
  Distance from goals
Self-description  Bias
  Temperament or disposition:  see  More**
    No. favorable adjectives  More®*  ot puna Lene
    No. unfavorable adjectives  Less**  Less’
  Social patterns:  More**
    No. favorable adjectives  More*
    No. unfavorable adjectives  Less**  Less**  oe
  Work and study habits  oa  More
    No. favorable adjectives  Moret  Moret  Morey  Morgi
    No. unfavorable adjectives  Less*
  No. major strengths  More*  Less*
  No. major weaknesses

Note. Entries show the characteristics associated with high scores (top 27%)
on each scale as compared with low scores (bottom 27%).
* p < .05.
** p < .01.
*** p < .001.


Socialization 


The scale’s purpose, as described in the
manual, is “to indicate the degree of social
maturity, integrity, and rectitude which the
individual has attained.” High scorers tend to 
be seen as “serious, honest, industrious, mod- 
est, obliging, sincere, and steady; as being 
conscientious and responsible; and as being 
self-denying and conforming.” 

High scorers reported their fathers as hav- 
ing less education (21% with less than high 
school graduation compared with 10% for the 
lows, 35% with college degrees compared with 
49% for the lows). The highs also described 
their fathers’ temperaments in more positive 
terms, their fathers’ relationships with them 
as closer, and their fathers’ influence on their 
lives as more positive. The mothers of the 
highs also had less education. The highs de- 
scribed their mothers’ temperaments as more 
positive, their attitudes toward discipline as 
less strict—and yet they were less frequently 
reported as using no punishment at all; their 
efficiency as people was reported as high, their 
relations with their sons as closer, and their 
influence on their sons’ lives as more positive. 

The high scorers on So more frequently 
reported that as children their relations with 
their sisters had been good and that they had 
less frequently encountered special family 
problems or traumatic events (such as death 
or divorce). High scorers also used fewer 
negative adjectives in describing themselves 
as children. Not a single variable in the 
autobiography which was related to ado- 
lescence or current circumstances differenti- 
ated the high So’s from the lows. In charac- 
terizing their social patterns the highs used 
less unfavorable adjectives, and in describing 
their work and study habits they used more 
positive and less negative adjectives. 

The large number (and the kind and di- 
rection) of parent-related variables which 
differentiated the high and low scorers on So 
is congruent with the manual’s scale descrip- 
tion. From the autobiographical data it seems 
reasonable to describe the high scorers as con-
forming, modest, and industrious, all at- 
tributes reasonably found in the sons of less 
well-educated parents who are seeking up- 
ward mobility through education. 


Achievement via Conformity 


The scale’s purpose, as described in the 
manual, is “to identify those factors of in-
terest and motivation which facilitate achieve-
ment in any setting where conformance is a
positive behavior.” High scorers tend to be 
seen as “capable, co-operative, efficient, or- 
ganized, responsible, stable and sincere; as 
being persistent and industrious; and as valu- 
ing intellectual activity and intellectual 
achievement.” 

High scorers on Ac describe their fathers’ 
temperaments more positively and report 
their fathers as having closer relationships 
with them. Their mothers have less education, 
are described as having more pleasant tem- 
peraments, as having closer relationships with 
them, and as having a more positive influence 
on their lives. They report better relationships 
with their sisters as children and use fewer 
negative adjectives in describing themselves 
as they were at this age. Concerning adoles- 
cence, the highs less frequently describe them- 
selves as having been unhappy and report 
they got along better with other boys. In their 
current circumstances the high Ac scorers re- 
port themselves as enjoying cultural and in- 
tellectual pursuits more, and see themselves 
as nearer to their goals. In describing them- 
selves the highs use more favorable and less 
unfavorable adjectives, as is true in describ- 
ing their work and study habits. They also 
describe themselves as having fewer major 
weaknesses. 

The emerging image is of an industrious, 
efficient, cooperative, organized, fluent, and 
realistic person, very close to the description 
given by the manual. 


Achievement via Independence 


The scale’s purpose, as described in the 
manual, is “to identify those factors of in- 
terest and motivation which facilitate achieve- 
ment in any setting where autonomy and in- 
dependence are positive behaviors.” High 
scorers tend to be seen as “mature, forceful, 
dominant, strong, demanding, and foresighted; 
as being independent and self-reliant; and as 
having superior intellectual ability and judg- 
ment.” 

High scorers on Ai report their fathers as
better educated (54% having a college degree 
compared with only 37% for the lows). The 
type of punishment used by father was de- 
scribed as deprivation of privileges less fre-
quently by the highs than by the lows. The
highs also less frequently reported their moth- 
ers’ efficiency as people as being high and 
their relationships with other people as good. 
The high scorers were more often firstborn or 
only children than were low scorers. During 
adolescence high Ai seemed a cause of severe
problems. The highs got along less well with 
other boys and girls, dated less, were less 
popular, and less often reported their peer 
status as high. They used more unfavorable 
adjectives to describe their current social pat- 
terns. Interestingly, there was no difference 
between the high and low scorers in their 
descriptions of their work and study habits 
or their major strengths and weaknesses. 

The high scorer on Ai appears to be a 
lonely person, without warm ties to his fam- 
ily, experiencing an isolated adolescence and 
an unsatisfactory social life. Although this 
does not give too much support to the descrip- 
tion in the manual, it certainly fits well with 
a theory-derived picture of an independent 
individual, especially in his inability to meet 
the conforming pressures of adolescence. 


Intellectual Efficiency 


The scale’s purpose, as described in the 
manual, is “to indicate the degree of personal 
and intellectual efficiency which the individual 
has attained.” High scorers tend to be seen as 
“efficient, clear-thinking, capable, intelligent, 
progressive, planful, thorough, and resource- 
ful; as being alert and well-informed; and as 
placing a high value on cognitive and intel- 
lectual matters.” 

High scorers on Ie characterized their fath-
ers’ attitudes toward discipline as moderate, 
rather than either strict or lenient, more fre- 
quently than did the low scorers. Their moth- 
ers were better educated (43% with college 
degrees compared with 29% for the lows) and 
were more likely to take some part in punish- 
ment (15% of the highs reported no punish- 
ment by mothers compared with 29% of the 
lows). Highs rated their mothers as having a 
more positive influence on their lives. They 
used fewer negative adjectives to describe 
themselves as children; were less often un- 
happy as adolescents, although they dated 
less often; enjoyed cultural and intellectual 
pursuits more; used more favorable and less 
unfavorable adjectives to describe themselves;
and used more favorable adjectives to de- 
scribe their work and study habits. 

A capsule description of the high scorers as 
organized, efficient, and culturally and intel- 
lectually committed can be derived from their 
autobiographies. This description is consistent 
with the manual’s description of intellectual 
efficiency. 


DISCUSSION


Autobiographical descriptions by high scor- 
ers on five of the six personality scales exam- 
ined are consistent with the descriptions pro- 
vided in the manual. This agreement is seen 
as providing a measure of construct validation 
for these scales. Evidence for the sixth scale 
was neutral rather than negative. 

The possibility that the results may be 
understood in terms of a single construct such 
as social desirability has been considered. It 
is true that many of the autobiographical 
findings revolve about the fact that the sub- 
ject describes himself or his father or mother 
in favorable or unfavorable terms. Yet, there 
are clear differences in the groups ranked high 
or low on each of the variables. Careful exam-
ination of the traits which emerge from the
different groups (e.g., high So’s are conform- 
ing, modest, and industrious; high Do’s are 
fluent, popular, high-status, self-confident) 
suggests that a single variable such as social 
desirability is inadequate to account for the 
richness and variety in the present data. 
Rather, it is inferred that the analysis of a 
structured autobiography provides a useful 
method of construct validation of personality 
scales, as well as a source of new insights into 
the qualities of the group which is being 
studied.

A classification system, designed to focus on 2 aspects of client behavior,
voice quality and expressive stance, was applied to interviews of 53 clients. 
The behavior thus coded was vector analyzed by columns, yielding factor 
loadings for interviews for each client for 1st, 2nd, and 11th interviews. Esti- 
mated factor loadings were obtained for 12 attrition clients for 1st and 2nd 
interviews. Attrition could be predicted from 1st interview process. For the 2nd
and 11th interviews, 2 of the 4 factors extracted were found to differentiate 
among outcome groups formed by combining client’s and therapist’s vantage 
points. The authors discuss the value of these client-process variables not only 
for prognosis, but also as a background against which to evaluate therapist
participation or experimental intervention.


Recent surveys of the literature on the out- 
come of psychotherapy have concluded that 
if we are to make sense of the ambiguous, 
often conflicting findings in this area, it will 
be necessary to include in our equations 
meaningful descriptions of the therapy proc-
ess that intervenes between initial and post-
therapy assessment (Cross, 1964; Dittman, 
1966). We cannot depend simply on actuarial 
variables, such as number and length of ses- 
sions or orientation and length of the experi- 
ence of the therapist, but we must be able to 
distinguish those aspects of the process that 
may be related to substantial favorable change 
or even, as Bergin (1963) has suggested, to 
unfavorable change. 

Accurate characterization of the process 
engaged in by the client seems particularly 
basic for a number of reasons. In the first 
place it has been demonstrated repeatedly 
that in accounting for differential results a 
substantial part of the variance lies in the 
client, though the distinctions are not usually 
made along standard nosological lines. The 
Rorschach study of Endicott and Endicott 
(1964) was a vivid demonstration that any 
attempt to evaluate the contribution of the 
therapist must take into account the differ- 
ing resources of clients for engaging in a 
therapeutic process. Furthermore, it seems
probable that the effects of different kinds
of therapist intervention can best be evalu- 
ated against the backdrop of the ongoing 
client process. Particular therapist behaviors 
or other experimental variations are so remote 
in time from the actual assessment of therapy 
outcome, with so much intervening interac- 
tion, that their effects are problematic without 
some more immediate index such as change 
in client process. In the case of therapy- 
analogue studies there is usually no “thera- 
peutic outcome” to assess, but an index of 
“client” interview process would be highly 
relevant. 

If any client-process measures are to be 
considered sturdy enough to serve this func- 
tion of dependent variable in investigations 
of the conditions of the therapy interaction, 
they must be firmly anchored to outcome 
indexes. That is, we must “validate” the proc- 
ess variables by demonstrating that the client 
behaviors involved are crucial in psycho- 
therapy. The literature on content analysis 
contains no lack of suggested client-process 
measures. The shortage is, rather, one of 
variables that have been sufficiently well 
tested to serve as tools for the investigation 
of the subtle phenomenon of psychotherapy. 

The present paper has two purposes. The 
first is to describe in some detail a classifica- 
tion system for characterizing client behavior 
during the therapy hour in terms of its sty- 
listic aspects. The system is designed to yield 
process variables that can be used either as a 
moment-by-moment index of the level of cli-
ent participation or as a means of character-
izing a total hour. The second purpose is to
explore, for a substantial sample of clients, 
relationships between these process variables 
and amount of favorable personality change 
after a series of interviews. 


METHOD 
Client Classification System 


The system used for classifying and coding client 
behavior was designed to meet criteria which were 
discussed more fully by Wagstaff (1959) and Rice 
(1965), but which are summarized as follows: (1) 
It was designed to focus on stylistic aspects of the 
client’s participation rather than on specific content. 
(2) It was intended to distinguish the moment-by- 
moment behavior of the client rather than being 
used to characterize a total hour. (3) The distinc- 
tions were to be made as much as possible on the 
behavioral level with a minimum of inferences 
about meaning. Client responses classed together were 
to be similar as behavior, rather than judged to be 
similar on conceptual grounds. In other words, the 
system was intended to be descriptive rather than 
containing built-in explanations. (4) The system 
focused on aspects of behavior that gave promise 
of being important components of the client’s at- 
tempts to express himself, picking up the vehicles 
of self-exploration rather than its concomitants. 
Also, an attempt was made to select aspects likely 
to have impact on the therapist apart from the 
content of the communication. (5) Finally, of course, 
the coded behaviors were to be quantifiable in such 
a way as to permit scrutiny of relationships among 
behaviors and between behaviors and independent 
measures of therapy outcome. 

Inasmuch as the client is sending communications 
to the therapist on a variety of different levels, 
from the kinesic (body motion) level to that of the 
most complex meanings, it would be desirable to 
classify client behavior on a variety of different 
levels, eventually characterizing a given client re- 
sponse in terms of a combination of levels. The 
present classification system addresses itself to two 
levels of communication. The first concerns voice 
quality, that which is vocal but nonverbal. The 
second is lexical, having to do with meanings con- 
veyed through words, but focusing on the manner 
of expressing experience rather than specific content. 

This classification system has been developed 
within the client-centered framework, in which the 
primary task of the therapist is to help the client 
to engage in a process of self-exploration with as 
much freshness and immediacy as possible. Some 
adaptations would be necessary before using this 
system to characterize therapy process in another 
orientation. It seems probable that the vocal dimen- 
sion would be relevant in any situation involving 
self-exploration in an interpersonal context. The 
lexical dimension, expressive stance, is probably more 
specific to a therapy in which both therapist and
client stay primarily in the client’s internal frame
of reference.


Aspect A: Voice Quality 


In line with the decision to classify style of par- 
ticipation rather than specific content, no attempt 
was made to distinguish particular emotional states.
The strategy used was to locate a limited number 
of voice patterns that varied among clients, showed 
variation over sessions, and seemed to differentiate 
meaningfully among sessions that had previously 
been characterized by therapists as good or poor 
hours. The patterns thus located were then described 
in terms of energy, pitch range, tempo, stress pat- 
terns, etc. These descriptions, together with repre- 
sentative taped samples were used by the judges 
in making their distinctions.² It seems probable that
these voice patterns have to do with the degree and 
nature of the client’s involvement in the therapy 
process. The four subclasses include qualitatively dif- 
ferent kinds of voice patterns and are not intended 
to form a scale. 

Emotional. Responses placed in this first subclass 
may take a number of different forms, but in gen- 
eral there is energy overflow rather than control. 
The voice breaks, trembles, or chokes. The general 
impression is one of disruption of the usual voice 
patterns with varying degrees of effort at control. 

Focused. These responses are characterized by a 
good deal of energy, but not by a wide pitch fluctu- 
ation. There are irregularities in the stress of syl- 
lables, and stresses are not usually accompanied by 
much pitch rise. There are marked irregularities of 
tempo. Impressionistically, the total effect is one of 
pondering, of energy turned inward in an exploring 
fashion. 

Externalizing. These responses are characterized by 
comparatively high energy and by a wide pitch 
range in the sense defined by Trager (1958). There 
is an unusually regular stress pattern, with the heavy 
stresses accompanied by a rise in pitch. This stress 
pattern, together with the presence of terminal con- 
tours that rise or fall in unexpected places, gives 
an effect of cadence or preformed pattern. The total 
effect is one of energy turned outward, a “talking 
at” quality. 

Limited. Responses placed here are characterized 
by low energy, a narrow pitch range, and an even 
tempo. The stress pattern is typical for English, but 
the stresses themselves are relatively weak. The voice 
is thinned from below. The general impression one 
gets is that of limited involvement, of distance from 
what is being expressed. 


Aspect B: Expressive Stance 


The second main class, the lexical, has been desig-
nated expressive stance because it focuses on the
stance that the client takes in relation to whatever
he is discussing. Here again the four subclasses are
not intended to represent a scale.

² A study by Duncan (1966) indicated that a
more microscopic characterization of these patterns
by means of suprasegmental and paralinguistic tran-
scription would yield further interesting subdivisions.

Objective analysis and description. The client may 
be discussing himself or things outside himself, but 
in either case it is as if he steps outside and views 
himself and the world as objects to be described, 
categorized, or analyzed. Example: “Whenever I get 
into difficult situations, I’m likely to make a mess 
of them.” 

Subjective reaction. The client is focusing on the 
subjectivity of his own reaction to things impinging 
on him. He is dealing not with generalities, but 
with an immediate subjective response to a specific 
stimulus. Example: “When he did that, I felt 
cheated, like I was being used.” 

Static feeling description. The client is discussing 
feelings, but in a static, objectified manner, as things 
to be reported, labeled, or explained. Although feel- 
ings are subjective by definition, they are dealt with 
here in a generalized or analytical fashion that is 
similar to the form of the first subclass. Example: 
“This feeling of anger is one I have had since 
childhood.” 

Differentiated exploration. Here the client explores 
an inner experience in an immediate and differenti- 
ated fashion without subjecting it to cognitive 
operations. He focuses on the idiosyncratic qualities 
of his experience, often in highly sensory and expres- 
sive language. Example: “I felt utterly flat, emptied 
out.” 


Rating Method 


The rating unit used was the total response, de- 
fined as everything said by the client between two 
therapist responses. Each interview was divided into 
thirds on the basis of elapsed time, and 10 consecu- 
tive client responses were taken from each third. 
There were thus 30 client responses from each 
interview. These responses were classified separately 
on each of the two aspects, with each response being 
placed in one and only one subclass of each aspect. 
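
The sampling rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sample_responses`, the use of response onset times in seconds, and the choice to start each run of 10 consecutive responses at the first response of its third are all hypothetical, since the text does not say where within each third the run began.

```python
from typing import List


def sample_responses(onsets: List[float], duration: float,
                     per_third: int = 10) -> List[int]:
    """Return the indices of the client responses to be rated.

    onsets   -- onset time (seconds) of each client response, in order
    duration -- total elapsed time of the interview (seconds)

    Assumption: the run of consecutive responses begins with the first
    response in each third; the source does not specify the start point.
    """
    chosen: List[int] = []
    for third in range(3):
        lo = duration * third / 3
        hi = duration * (third + 1) / 3
        # Responses whose onset falls inside this third, in order.
        in_third = [i for i, t in enumerate(onsets) if lo <= t < hi]
        chosen.extend(in_third[:per_third])
    return chosen
```

With an hour-long interview containing at least 10 client responses in each third, the function returns 30 indices, matching the 30 responses per interview described above.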

All classifications were based on hearing the taped 
responses; no transcripts were used. Interviews were 
coded so that judges knew nothing about the case. 
Judges were instructed not to listen to the following 
therapist response and to listen to the preceding 
therapist response only when necessary to understand 
antecedents to pronouns, etc. Classification was done 
by graduate students with some knowledge of client- 
centered therapy. Each tape was independently rated 
by two judges, with a third independent judge to 
break ties. 


Reliability 

Earlier work with this kind of material has shown 
that interjudge agreement can be raised by means 
of additional training even after a level had been 
reached that clearly exceeded chance beyond the .05 
level of significance. Therefore, for each of the 
main classes a standard was set which had as a 
lower bound a .05 level of significance using Cohen’s 
kappa (Cohen, 1960) and which had as an upper
bound the degree of agreement reached by the two
most reliable judges. A judge was considered trained 
when he reached this degree of agreement with the 
standard judges on three successive training tapes. 
In practice this involved an interjudge agreement 
of 75% or above on each of the two aspects. 
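
For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's kappa for two judges who each place every response in exactly one subclass. The function name and the one-letter subclass labels in the usage note are hypothetical; the significance test against chance mentioned above is not reproduced here.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(judge_a: Sequence[str], judge_b: Sequence[str]) -> float:
    """Chance-corrected agreement between two judges (Cohen, 1960)."""
    if len(judge_a) != len(judge_b) or not judge_a:
        raise ValueError("both judges must rate the same nonempty set")
    n = len(judge_a)
    # Proportion of responses on which the judges agree.
    observed = sum(a == b for a, b in zip(judge_a, judge_b)) / n
    # Agreement expected by chance from each judge's marginal totals.
    count_a, count_b = Counter(judge_a), Counter(judge_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class is used")
    return (observed - expected) / (1.0 - expected)
```

As an illustration with made-up ratings, judges who agree on 6 of 8 responses (75% raw agreement) with labels `E F F L X X F L` and `E F X L X X F F` obtain a kappa of 15/23, or about .65, which shows how far raw agreement can exceed the chance-corrected figure.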


Subjects 


The study included taped interviews of 65 clients, 
all seen at the University of Chicago Counseling 
and Psychotherapy Research Center by therapists 
with a Rogerian orientation. All the therapists in-
volved had at least 2.5 years of experience, and
most had much more than that. 

All of these clients were following the “availability 
pattern,” a variant of time-limited therapy intro- 
duced by Shlien (1957). Clients had 20 sessions, 2 
a week for 10 weeks. They then had a 10-week 
vacation from therapy after which they were free 
to resume for another 10 weeks with the same 
therapist. This pattern could be repeated as many 
times as desired. The findings to be reported were 
drawn from the first 20-interview block. 

The clients were representative of the population 
seen at the center at that time. There were 36 males 
and 29 females. The age range was from 18 to 56, 
with a median of 28. The median educational level 
was 4 years of college. The factor analyses were 
based on interviews of the 53 clients who completed 
the first block. The 12 clients who completed less 
than 10 interviews will be discussed below. 


Analysis 


Three interviews from each client were studied, 
the first, second, and eleventh. The 30 responses from
each interview had been placed in one and only
one of the four subclasses of each aspect. This
yielded 16 (4 × 4) combination classes in which a
given response might be classified. For each point 
in time a separate matrix of interview similarity 
was constructed. That is, the 53 first interviews 
were analyzed separately, and the same was true 
for the second and for the eleventh. Thus, there were
three matrices, each with 53 rows and columns. 
The method of analysis used was one analogous to 
factor analysis and was developed by Butler, Rice, 
and Wagstaff (1963). In contrast to the more usual 
analysis which starts with a matrix of intercorrela- 
tions of scores, the present analysis involves a 
matrix, the entries of which represent on a scale 
between zero and unity the degree to which pairs 
of interviews share the same behaviors. After com- 
munality estimates have been added, each matrix is 
factored by the principal-axis method in the usual 
way. For each point in time four factors accounted 
for at least 95% of the sum of the latent roots, with 
succeeding factors accounting for small and approxi-
mately equal proportions. Therefore, in each case four
factors were rotated and interpreted. Rotation was 
carried out by the normal Varimax solution.

For each point in time the loadings of each inter- 
view were obtained on each of the four factors. 




Each factor represented an interview type which 
could be described in terms of the client behaviors 
characterizing it. That is, an interview type is defined 
by the most frequently appearing behavior classes 
in interviews with high loadings on the factor in 
question and low loadings on the other factors. 
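
The first step of this analysis, building a matrix whose entries lie between zero and unity and express how far two interviews share the same behaviors, can be sketched as follows. The similarity index used here (the summed overlap of the two 16-class frequency profiles, divided by the number of responses) is an assumption chosen for illustration; the text does not give the index used by Butler, Rice, and Wagstaff, and the communality estimation, principal-axis factoring, and Varimax rotation are not reproduced. The abbreviation `SF` for static feeling description is likewise hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

VOICE = ("emotional", "focused", "externalizing", "limited")
STANCE = ("OA", "SR", "SF", "DE")  # "SF" is a hypothetical abbreviation
CLASSES = list(product(VOICE, STANCE))  # the 16 (4 x 4) combination classes


def profile(responses: Sequence[Tuple[str, str]]) -> List[int]:
    """Count how many responses fall in each of the 16 combination classes."""
    counts = {c: 0 for c in CLASSES}
    for r in responses:
        counts[r] += 1
    return [counts[c] for c in CLASSES]


def overlap(p: Sequence[int], q: Sequence[int]) -> float:
    """Assumed similarity index: shared responses as a proportion (0 to 1)."""
    total = max(sum(p), sum(q))
    if total == 0:
        raise ValueError("profiles must contain at least one response")
    return sum(min(x, y) for x, y in zip(p, q)) / total


def similarity_matrix(profiles: Sequence[Sequence[int]]) -> List[List[float]]:
    """Square, symmetric matrix of interview-by-interview similarity."""
    return [[overlap(p, q) for q in profiles] for p in profiles]
```

Applied to the 53 first interviews, such a function would yield one of the three 53 × 53 matrices described above; the factoring itself would then be done with a numerical library.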


RESULTS 


The first three columns of Table 1 show 
the four factors or interview types for each 
point in time. For each of the three times 
sampled the first interview type (the first 
factor extracted) was characterized by the 
client’s use of an externalizing voice quality 
and an expressive stance of objective analysis 
and description. That is, the client seemed 
to be turning his energy outward, talking at 
the therapist, and viewing himself and his 
world as objects to be objectively analyzed 
or described. Previous research with an earlier 
version of this classification system (Butler,
Rice, & Wagstaff, 1962) suggested that
most client-centered interviews are charac- 
terized by a certain amount of this behavior, 
but that a high concentration of it would be 
unfavorable for therapy. 

The second factor extracted for the first
interview matrix was also characterized by
the expressive stance of objective analysis and 
description, but in this case the client was 
using a limited voice quality. For the second 
and eleventh interviews this was the third 
factor extracted. The prediction from earlier 
work is less clear-cut in this case, but it seems 
probable that this behavior would tend to be 
followed by poor or equivocal outcome. 

The third interview type emerging from the 
matrix of first interviews was characterized 
by a focused voice quality and an expressive 
stance about equally divided between objec- 
tive analysis and subjective reaction. In other 
words, the voice quality used suggests energy 
turned inward toward inner exploration. 
About half of the time was devoted to the 
client’s exploration of his own subjective re- 
actions to things impinging on him, with the 
other half spent in a more analytical or de- 
scriptive approach. Earlier research has sug- 
gested that these kinds of client behavior will 
probably be followed by a favorable outcome 
to therapy. It is interesting to note that the 
picture had changed from the first to the 
second interviews, and that in the second
interviews these same behaviors, now repre-
sented by Factor II, were accounting for a
higher proportion of the total variance. Fur-
thermore, in the eleventh interview matrix
these two kinds of behavior were represented
on separate factors. Factors II and IV were
characterized by focused voice quality. Factor
II, however, was characterized by an expres-
sive stance of objective analysis, while IV was
characterized by subjective reaction with a
small amount of differentiated exploration.


TABLE 1
CLIENT PROCESS MEASURES IN RELATION TO CHANGE AFTER 20 INTERVIEWS

            Characteristic behavior
Factorᵃ     Voice quality   Expressive stance   Outcome groupsᵇ

1st Interview
I           Externalizing   OA                  None
II          Limited         OA                  None
III         Focused         SR or OA            S>CH, S>At,* M>At, TH>At
IV          Externalizing   SR                  S>CH

2nd Interview
I           Externalizing   OA                  At>S
II          Focused         SR or OA            S>M, S>CH,* S>At, TH>CH,* TH>At, M>At
III         Limited         OA                  M>S,* M>TH,* M>CH*
IV          Focused         OA                  S>CH

11th Interview
I           Externalizing   OA                  None
II          Focused         OA                  S>M, S>CH,* TH>CH*
III         Limited         OA                  None
IV          Focused         SR or DE            S>TH, S>M, S>CH

Note.—Abbreviated: OA = objective analysis, SR = subjective reaction, and DE = differentiated exploration.
ᵃ In order of extraction.
ᵇ Discriminated at .05 level or beyond by Mann-Whitney U test.
* p < .01, one-tailed test.

Finally, the fourth interview type for first 
interviews involved an externalizing voice 
quality and an expressive stance of subjective 
reaction. It seems probable that this combina- 
tion would be mildly favorable. (The fourth 
factor for the second interviews was not a 
clear one, but was roughly characterized by 
focused voice quality with more objective 
analysis than was picked up by the second 
factor.) 

Having isolated and described the fore- 
going interview types and spelled out the 
expectations derived from preliminary studies, 
the next step was to examine relationships be- 
tween these process variables and various 
outcome indexes. The study by Cartwright, 
Kirtner, and Fiske (1963) underlined the 
problem of selecting outcome criteria to which 
to anchor measures of the process of therapy. 
Inasmuch as outcome criteria assessed from 
different vantage points have proved to be 
only minimally related, each presumably con- 
taining its own “truth,” there seem to be some 
advantages in using combinations of at least 
two vantage points, forming subgroups from 
combinations of the two criterion measures. 
By forming groups high on both measures or 
low on both, we obtain fairly unequivocal 
“success” and “minimal change” groups. In 
addition, the two “mixed” groups can po- 
tentially yield interesting information not 
apparent from the two pure groups. 

The 53 clients included in the factor analy- 
sis were dichotomized first on the basis of 
favorable change in correlation between the 
client’s self and ideal-self sorts on the Butler- 
Haigh Q sort (Butler & Haigh, 1954) and, 
second, on the basis of therapist rating of 
favorable change. Both of these measures 
were taken after the first block of 20 inter-
views. Group S was the unequivocal success
group, the clients having changed .20 or more
on their self—ideal-self correlations and having
received a therapist change rating of 6-9
on a 9-point scale. Group M was the pure
minimal change group, the clients having
changed .199 or less on their self—ideal-self
correlations and having received a therapist
change rating of 1-5 on a 9-point scale. The
two mixed groups were Group TH (therapist
rating high, client low) and Group CH
(client high, therapist low). A fifth group
was included, Group At, consisting of the 12
clients who completed fewer than 10 inter-
views and were, therefore, classed as attrition
clients. These 12 clients, for whom there were
no posttherapy tests, were not included in
the factor analysis described above. How-
ever, it was possible to estimate their inter-
view factor loadings and thus to include them
as a fifth group. Table 2 summarizes the
characteristics of these five groups, including
certain actuarial material. There were no
significant differences between groups on age,
sex, or educational level.


TABLE 2
CHARACTERISTICS OF THE FIVE OUTCOME GROUPS

Characteristic                        S      TH     CH     M      At

Number                                16     16     8      13     12
Therapist change rating               6-9    6-9    1-5    1-5    —
Self—ideal-self correlation change    ≥.20   ≤.19   ≥.20   ≤.19   —
No. male                              6      10     5      8      7
No. female                            10     6      3      5      5
Mdn age                               27     26     28     28     23
Mdn years of education                16     16.5   16     14.5   14.5

Note.—Abbreviated: S = Success, TH = Mixed Group 1, CH = Mixed Group 2, M = Minimal Change, and At = Attrition.
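
The grouping rule for the five outcome groups can be restated compactly in code. The sketch below uses the cutoffs given in the text (a self—ideal-self change of .20 or more, a therapist rating of 6-9, and fewer than 10 interviews for attrition); the function and parameter names are hypothetical.

```python
from typing import Optional


def outcome_group(interviews_completed: int,
                  q_sort_change: Optional[float],
                  therapist_rating: Optional[int]) -> str:
    """Assign a client to Group S, TH, CH, M, or At."""
    # Attrition clients have no posttherapy measures.
    if interviews_completed < 10:
        return "At"
    if q_sort_change is None or therapist_rating is None:
        raise ValueError("completed cases need both outcome measures")
    if not 1 <= therapist_rating <= 9:
        raise ValueError("therapist rating is on a 9-point scale")
    client_high = q_sort_change >= 0.20
    therapist_high = therapist_rating >= 6
    if client_high and therapist_high:
        return "S"   # unequivocal success
    if therapist_high:
        return "TH"  # therapist rating high, client low
    if client_high:
        return "CH"  # client high, therapist low
    return "M"       # pure minimal change
```

For example, `outcome_group(20, 0.40, 3)` returns `"CH"`, the mixed group whose behavior receives particular attention in the Discussion.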

The last column in Table 1 shows the de- 
gree to which the different interview process 
types differentiate among the five outcome 
groups at the three different points in time. 
It is apparent that the 12 clients who quit 
therapy in less than 10 interviews show sty- 
listic differences as early as the first interview. 
They show significantly less Type III₁ (see
Table 1) behavior than Groups S, TH, or
even Group M. The only group from which
they are not differentiated is CH. In fact,
CH is similar to Group At in having signifi-
cantly less of this Type III₁ behavior than
Group S. That this lack of Type III₁ behav-
ior in Group At is not simply a “closing off”
phenomenon is indicated by the fact that the
median number of interviews for this group 
was five, and only one client had fewer than 
three interviews. At the time of the first 
interview a relative absence of focused voice 
quality accompanied by either objective 
analysis or subjective reaction is an indicator 
of either early termination or equivocal 
outcome. 
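
The group comparisons in Table 1 rest on the Mann-Whitney U test. As a reminder of what the statistic measures, here is a minimal sketch that computes U for two groups of factor loadings, using midranks for ties. The function name and the loadings in the usage note are invented for illustration; they are not data from this study, and no significance level is computed.

```python
from typing import List, Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U_x, U_y); U_x counts the pairs in which x exceeds y."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks: List[float] = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each its midrank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    return u_x, n_x * n_y - u_x
```

With hypothetical loadings of .62, .55, .71, and .48 in one group and .30, .41, and .52 in another, the first group exceeds the second in 11 of the 12 possible pairs, so the function returns (11.0, 1.0). The smaller value would then be referred to tables of U, or to a statistical library, to obtain the one-tailed probability.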

The second interview findings, shown in 
the last column of Table 1, enable us to make 
much more clear-cut predictions. The best 
predictor is still the appearance of focused 
voice quality, accompanied by both subjective 
reaction and objective analysis, now shown 
as Type II₂. Once again we can differentiate
Group At from all other groups except Mixed 
Group CH. Group S is now significantly dif- 
ferent from M and from CH. The two mixed 
groups are now differentiated from each other, 
with TH higher than CH. The evidence 
strongly suggests that this is “productive” 
therapy process. The appearance of a sub- 
stantial amount of this favorable therapy 
behavior in the second interview is an indi- 
cator that the client will remain in therapy 
for 10 or more interviews and at the end 
of the first block of therapy will be seen by 
his therapist as having made marked gains. 

The second best predictor, this time in an 
unfavorable direction, is Type III₂ behavior,
characterized by limited voice quality and ob- 
jective analysis. This differentiates Group M 
from all except Group At beyond the .01 
level. This trend was present in the first 
interviews, but was not significant. The ap- 
pearance of a substantial amount of this 
behavior in the second interview suggests that 
neither the client nor the therapist will see 
much favorable change at the end of the 
first block of interviews. 

Factor I₂ behavior is still not a good
predictor, making only one differentiation,
Group S from Group At. The differentiation
of S from Mixed Group CH just fails of sig-
nificance because of the small size of Group
CH. Factor IV₂, although not very clear,
does differentiate Group S from CH. 

Looking now at the eleventh interviews 
(Table 1), it is clear that the behavior we 
have characterized as “favorable therapy 
process” is now separated into two parts,
represented on separate factors. Inter-
views with focused voice quality and objec- 
tive analysis are now represented on Factor 
II, while interviews that have focused voice 
quality but an expressive stance that is 
mostly subjective reaction with a small 
amount of differentiated exploration are repre- 
sented on Factor IV. Looking at the last 
column of Table 1, it is clear that while 
Type II₁₁ behavior is characteristic of both
Groups S and TH, Type IV₁₁ behavior now
differentiates Group S from all other groups, 
even from TH. The client whose voice quality 
is primarily focused while he is attending to 
the subjectivity of his own reactions and 
inner experiences seems to be engaged in an 
unequivocally productive process. 


Discussion 


The above analyses have shown that Group 
At could be differentiated from three of the 
other groups as early as the first interview. 
The two pure groups, Groups S and M, were 
clearly differentiated from each other by the 
second interview. However, this latter finding 
leaves unanswered some questions about rela- 
tionships between favorable process and the 
outcome criteria used. It could be accounted 
for by a relationship with only one of the 
outcome criteria, in this case therapist rating. 
If both therapist rating and self-ideal-self 
change were related to productive process in 
a simple linear fashion, then Groups S and M 
should be at the extremes with Groups TH 
and CH somewhere in between. Groups S, 
TH, and M are ordered in the expected way. 
Although Group S was not significantly dif- 
ferent from TH until the eleventh interview, 
it did show somewhat more favorable process 
in both the first and second interviews than 
did TH. For these two groups, both rated 
favorably by their therapists but differing on 
amount of self—ideal-self change, there seems 
to be a positive relationship between the 
amount of productive process in therapy and 
a rise in self-esteem as indicated by change 
in self—ideal-self correlations. 

Mixed Group CH, however, fell not in the 
middle, but below Group M in favorable proc- 
ess at each point in time. These two groups 
were both seen by their therapists as having 
made few gains, but the clients of Group CH
made large, sometimes strikingly large, gains
in self-ideal-self correlation. Apparently,
clients showing the least favorable process 
have made the greatest gains in reported self- 
esteem. This relationship was tested explicitly 
by combining Groups M and CH and corre- 
lating favorable process on first and second 
interviews with favorable self-ideal-self 
change. At each time there was, indeed, a 
negative correlation which was significant at 
the .05 level for the second interviews. To- 
gether these results suggest that there is a 
relationship between client process and self— 
ideal-self change, but a complex one. In thera- 
pies considered by the therapist to have gone 
well self-ideal-self change is positively asso- 
ciated with good process. In therapies seen 
by the therapist as not very satisfactory the 
relationship is negative. One reasonable hy- 
pothesis concerning this marked shift in re- 
ported self-esteem for clients with extremely 
little favorable process is that it is a defen- 
sive move and that the vulnerability that led 
to entry into therapy has been closed off. In 
fact, only one of the clients in Group CH 
returned to therapy after the 10-week vaca- 
tion, a proportion significantly lower than 
that of the total sample. There is some 
further evidence for this hypothesis, based on 
Rorschach findings and on qualitative differ- 
ences in voice qualities, but fuller exploration 
must be reserved for a later paper. 

Turning now to more general issues, the 
evidence seems clear that the two aspects 
embodied in the present classification system 
do provide a satisfactory description of some 
of the crucial features of the client’s ongoing 
process. The next step, of course, is to apply
this tool to some of the complex questions in- 
volving the situation, the therapist, and the 
interaction. In a comparison of time-limited 
and unlimited therapy, we might tackle the 
question of depth or intensity. Are time- 
limited therapies perhaps setting more limited 
goals, or is the same depth of self-explora- 
tion achieved more quickly? In relation to 
moment-by-moment interaction, one might 
ask if therapist responses that are rated as 
being more stimulating in style are followed 
by shifts in the client’s level of process.
Could clients be directly trained to engage
in more productive modes of participation 
without introducing artificial distortions? 
These are some of the questions currently 
under investigation.

The Manifest Anxiety (MA), Extraversion (E), Repression-Sensitization
(R-S), and 13 other personality scales were given to 226 Ss. The purpose was 
to determine the extent to which MA scores are attributable to the influence 
of extraversion and to explore the degree to which the R-S scale assesses 
attributes similar to those measured by the MA and E scales. From the results 
of a factor analysis of the 16 variables the following conclusions were drawn: 
Scores on the MA and R-S scales are almost totally independent of E. The 
MA and R-S scales are practically identical in psychological meaning; scores 
on both these scales are largely determined by two bipolar, orthogonal traits:
defensiveness and emotionality.


Among the many response-defined variables 
which have been related to behavior, the 
Manifest Anxiety (MA; Taylor, 1953), Ex- 
traversion (E; Eysenck, 1957; Jensen, 1958), 
and Repression-Sensitization (R-S; Byrne, 
1961) scales have been the focus of much 
recent research. The empirical correlates of
MA scores have generally been interpreted 
within the framework of Hull-Spence drive 
theory, those of E within a cortical-inhibition 
model, and those of the R-S scale within
ego-defense theory.
The use of the MA and E scales as predictors 
of performance in a variety of experimental 
situations has given rise to a well-known 
theoretical controversy which has revolved 
around correlations between these scales, as 
well as their empirical relationships to other 
variables (e.g., Eysenck, 1965; Spence & 
Spence, 1964). Eysenck (1957), for example, 
has proposed that MA scores are partly deter- 
mined by the trait extraversion-introversion, 
and that the conditioning correlates of MA 
scores reflect the influence of this trait rather 
than anxiety (drive) level. Though correlations 
between MA and E have been reported (e.g., 
Jensen, 1958; Spence & Spence, 1964), the 
question of whether the correlation between 
these scales is attributable to the common 
influence of extraversion-introversion or some
other factor remains an open one. The pres- 
ent study, therefore, investigated the factorial 
structure of these two scales in order to 
determine the factors contributing to their 
relationship. 

Since both behavioral extraversion and anx- 
iety level have been viewed within the con- 
text of ego-defense theory as related to the 
R-S scale, the question of the degree to 
which the R-S scale assesses attributes similar 
to those measured by the MA and/or E 
scales is of potential theoretical significance. 
The present study was, therefore, also di- 
rected at determining the degree to which 
these scales measure similar attributes, with 
a view towards determining the extent (if 
any) to which the results of the growing 
body of research employing the R-S scale 
might be considered as consistent with 
Hull-Spence drive theory and/or Eysenck’s 
inhibition model. 


METHOD 
Subjects and Procedure 


Two specially constructed biographical inventories 
of 300 and 266 items which contained the scales 
described below were administered to undergraduate 
students at the University of Wisconsin, Milwaukee. 
The two scales were administered on 2 days sepa- 
rated by an interval of approximately 2 months. 
The only restriction on the selection of Ss was the 
exclusion of students taking advanced courses in
psychology. The data of the 226 students who completed
both scales were machine scored for each
of the 16 variables included in the analysis. 

The data were analyzed on the University of 
Wisconsin CDC 1604-G high-speed computer by 
Control Data Corporation Program G2 WISC 
IMAGE (Harris, 1962). The following options, of 
several permitted by the program, were chosen: a 
communality solution based initially on the squared 
multiple-correlation coefficient of each variable with 
all the remaining variables and then iterated by 
Rao’s method to a convergence criterion of .005;
all factors corresponding to eigenvalues greater 
than 1.0 were retained and rotated to the normalized 
Varimax criterion (Kaiser, 1959). Harris (1962) 
reported that the number of factors retained by 
this method is usually about one-half the number 
of variables included in the analysis. Those factors 
worthy of minimal attention, as identified by the 
rotational procedure, are then discarded after 
rotation has been completed. 
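
The extraction options described above can be illustrated with a minimal sketch. It is not the G2 WISC IMAGE program: the function names are hypothetical, the eigenvalue rule is applied here to the unreduced correlation matrix for simplicity, and Rao's iterative refinement of the communalities is omitted. The sketch shows the squared multiple correlation used as an initial communality estimate, the eigenvalue-greater-than-1.0 retention rule, and a normalized Varimax rotation.

```python
import numpy as np

def squared_multiple_correlations(R):
    """Initial communality estimates: SMC of each variable with all the others."""
    R = np.asarray(R, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

def count_eigenvalues_above_one(R):
    """Number of eigenvalues of the correlation matrix greater than 1.0."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(R, dtype=float))
    return int(np.sum(eigenvalues > 1.0))

def varimax(loadings, max_iter=200, tol=1e-8):
    """Normalized Varimax rotation of a (variables x factors) loading matrix."""
    A = np.asarray(loadings, dtype=float)
    h = np.sqrt((A ** 2).sum(axis=1))   # square roots of the communalities
    A = A / h[:, None]                  # Kaiser normalization of each row
    p, k = A.shape
    T = np.eye(k)                       # orthogonal rotation matrix
    previous = 0.0
    for _ in range(max_iter):
        B = A @ T
        G = A.T @ (B ** 3 - B @ np.diag((B ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        if s.sum() < previous * (1.0 + tol):
            break
        previous = s.sum()
    return (A @ T) * h[:, None]         # undo the normalization

# Hypothetical 3-variable correlation matrix, for illustration only.
R_demo = np.array([[1.0, 0.6, 0.1],
                   [0.6, 1.0, 0.2],
                   [0.1, 0.2, 1.0]])
smc_demo = squared_multiple_correlations(R_demo)
n_demo = count_eigenvalues_above_one(R_demo)
```

Because the rotation is orthogonal, it redistributes variance among the factors without changing any variable's communality.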


Personality Scales 


In addition to the 3 scales of primary interest 
in the present study, 13 other scales were selected 
for inclusion in the biographical inventories. These 
scales were selected to provide a broad representa- 
tion of personality attributes in the factor analysis. 
Practical considerations made it necessary to alter 
the item content of the published versions of some 
of the scales; for example, in some cases buffer items 
were omitted. The intercorrelations obtained in the 
present study are highly consistent with those re- 
ported by other investigators, and it, therefore, seems 
reasonable to accept the results of the factor analysis 
as relevant to research using the published versions 
of these scales. 

Several scales which, for theoretical reasons, might 
have been included, were omitted from the matrix 
of variables. For example, though of potential inter- 
est, Edwards’ SD scale (Edwards, 1957) was omitted 
because the matrix included the MMPI K scale, 
which can be considered a useful substitute. Simi- 
larly, the Welsh A (first factor) scale might also 
have been included, but was not necessary because 
of its covariance with the MA scale. No direct 
measure of psychoticism was included, but, rather,
Barron’s Ego Strength scale (Es) was included as a
substitute measure of ego disturbance.

The following is a brief description of the vari- 
ables used in the data analysis. Copies of the two 
biographical inventories are available upon request 
from the authors. 

Repression-Sensitization (R-S): A scale developed 
by Byrne (1961) to measure the individual’s con- 
sistent mode of ego defense; low scores indicate 
repression, while high scores indicate sensitization. 

Manifest Anxiety (MA): Taken from a scale pre- 
sented by Taylor (1953) which was constructed as 
a measure of manifest anxiety and used as an
operational definition of drive level.

Extraversion (E): A measure of behavioral extraversion-introversion
developed by Eysenck which is included as part of the Maudsley
Personality Inventory (MPI). The items were taken from the MPI
as given by Jensen (1958).

Neuroticism (N): A measure of the degree to 
which an individual shows clinically defined neurotic 
attributes. N is included as part of the MPI. The
items were taken from the MPI as given by Jensen 
(1958). 

Agreeing Response Set (ARS): A measure of the 
tendency to agree or disagree with questionnaire 
items independent of the content of the item (Couch 
& Keniston, 1960). 

“Naysaying,” low F scale (F—): A five-item scale 
taken from Couch and Keniston (1960) which is 
presumably indicative of authoritarianism in indi- 
viduals who habitually disagree with strong positive 
statements, but agree with statements connoting 
authoritarianism when they are phrased in such a 
way as to emphasize qualification and a lack of 
exclamatory enthusiasm. According to Couch and 
Keniston this scale is highly and negatively cor- 
related (—.70) with the regular “positive” F scale. 
No data as to the reliability of this short scale were 
available. 

Positive F scale (F+): Five randomly selected 
items from the standard scale of authoritarianism 
(Adorno, Frenkel-Brunswik, Levinson, & Sanford, 
1950). No reliability data were available, but this 
scale was included because of a theoretical interest 
in it. 

Test Anxiety Scale (TAS): A scale taken from 
Sarason and Ganzer (1962) which represents the 
degree to which an individual describes himself as 
anxious and low in feelings of self-adequacy. The 
content of all the items is related to anxiety under 
test-taking conditions; Sarason and Ganzer, however, 
interpret TAS scores as indicative of generalized 
anxiety. 

Social Acquiescence Scale (SAS): A scale taken 
from Bass (1956) which measures the extent of 
agreement with a variety of statements concerning 
conformity to social expectation. 

Manifest Hostility Scale (MHS): A scale presented 
by Siegel (1956) which was designed to assess the 
degree of overt hostility an individual displays. 

Wesley Rigidity Scale (WRS), short form: A 
12-item version of the WRS (Wesley, 1953) de- 
veloped by Zelen and Levitt (1954) which is pur- 
ported to measure rigidity and to be related to 
authoritarianism. 

Ego Strength (Es): A scale developed by Barron 
(1954) to predict improvement in psychoneurotics 
undergoing psychotherapy and considered to be a 
measure of “ego strength.” 

MMPI K score (K): Among other things, the K 
score is presumed to be a subtle measure of an 
individual’s defensiveness with regard to admitting 
to psychological weaknesses (Dahlstrom & Welsh, 
1960). 

Prejudice scale (Pr): A scale taken from Gough 
(1951), who selected those items from the MMPI 
which showed a significant correlation with the 
Levinson-Sanford Anti-Semitism scale. 

Social Status (St): A scale developed from MMPI
items which showed a high correlation with an 
objective measure of socioeconomic status (Gough, 
1948). 

MMPI Lie scale (L): A measure of naive defen- 
siveness indicating the extent to which an individual 
chooses a response which presents himself in the 
most socially acceptable light (Dahlstrom & Welsh, 
1960). 

RESULTS 


Table 1 shows the means, standard devia- 
tions, communality estimates, and intercor- 
relations for the 16 variables. With the excep- 
tion of F+, all the variables show relatively 
good distribution characteristics. Though the 
mean of 1.5 and standard deviation of 1.0 
of the F+ scale indicated a fairly severe 
positive skew, this scale was included in the 
factor analysis because of a theoretical inter- 
est in the scale and a reliance on the robust- 
ness of the Pearson r (Cohen, 1965). 

The results of the factor analysis of the 16 
variables are shown in Table 2. Since the 
156-item R-S scale contains 17 of the 50 
items of the MA scale, two additional factor 
analyses, one with R-S eliminated and the 
other with MA eliminated, were computed in 
order to determine the effects of overlapping 
items on the rotated factor loadings. Table 2 
shows that MA loaded -.48, .61, and -.34
on Factors I, II, and IV, respectively; in
the factor analysis with R-S omitted, MA
loaded -.58, .69, and -.29 on the corresponding
factors. Table 2 shows that R-S
loaded -.59, .48, and -.38 on Factors I, II,
and IV, respectively; in the factor analysis
with MA left out, R-S loaded -.62, .49, and
-.35. Neither MA nor R-S showed an appreciable
loading (.30 or higher) on any other
factor in any of the three factor analyses 
computed. The rotated factor loadings and 
the correlation coefficient of .87 (see Footnote 4) shown in
Table 1 attest to the striking degree of
equivalence between the MA and R-S scales. 
However, the inclusion of both MA and R-S 
in the same factor analysis did not produce 
any substantial changes in rotated factor 
loadings from those obtained when one or 
the other scale was omitted. 

Factor I, which accounts for the largest 
amount of common variance (25.0%) among 
the scales, is a bipolar factor showing high 
positive loadings for the MMPI L and K 
scales and high negative loadings for the R-S 
scale and the MHS. The MA, N, and Pr 
scales all show moderate loadings on Factor I. 
In addition, the ARS shows an appreci- 
able negative loading and the Es shows an 
appreciable positive loading on this factor. 

Factor II, which accounts for only slightly 
less (23.5%) of the common variance than 
does Factor I, shows high positive loadings 
for TAS, MA, and N and a high negative 
loading on Es. The R-S scale shows a moder- 
ate positive loading, the Pr and ARS show 
appreciable positive loadings, and the St scale 
shows an appreciable negative loading. It will 
be noted that many variables that show high 


4 In order to check the reliability of this relationship,
the MA and R-S scales were administered to
353 additional subjects. The obtained correlation 
coefficient was .86, almost identical with that 
previously obtained. 


TABLE 1 

MEANS, STANDARD DEVIATIONS, COMMUNALITY ESTIMATES, AND INTERCORRELATIONS
Variable R-S MA E N ARS F- F+ TAS SAS MHS WRS Es K Pr St L
R-S 92 
MA :87* 87 
E =.39* —.36* 51 
N 75* O A+ —.37* 77 
ARS SOF 149 "11 52% 72 
F= mos —.09 Doo -112 ‘02.22 
F+ 28* 22 —15 21. 28* 10334 
TAS S7# 63" —.21 61 45% 02 124% 69 
SAS +31* 24" —05  .24* «58% 18 44+ lage 71 
MHS 59% SIF 11 .57* 53% —.22 25% <36% 24* 74 
WRS 20907 PE AE 08 AT As log nage 0 ka 
Es Z734 —.68* 274 — 56% —42* (01 —.28* —.56* —36% —.37* —16 75 
K 7.99% —.62% 21 —S5* —51* (09 —23 —.40* —.32* —.57* 14 44% 69 
Pr <70% 58* —.28% 55+ 48+ —i4 36+ 147+ 34+  65* 19 —.58* —.57* 17 
St — 40% —.42* 29% —41* —.23 —.13 —.21 —39+ —.26* —24* —22 50% 32% —'45* 52 
L =A —39* | 07 —38* —29% 06 —.04 —23 —106 —43* —07 37" 46% —38* (26% 50 
M |594 175 264 223 57 40 15 54 249 15.6 45 41.7 118 9.3 188 3.2 
SD 185 87 85 117 27 12 10 36 100 72 20.58 335.0 45. 4.1. 18 


Note. Estimated values of h² have been inserted in the major diagonal; N = 226.


*p <.00 

TABLE 2
FACTOR LOADINGS OF EACH VARIABLE ON THE OBTAINED FACTORS


Factor
Variable Communality
I II III IV V VI VII VIII

RS —.59 48 270 alienare 06 15, | 96 
MA —.48 ‘61 18 | —.34 ‘02 09 | —18 y oar 
E 04 | —16 | ~01 ‘67 15 | —.08 04 | —o1 ‘510 
N —.44 58 DD. AASTON Zip “19 04 | —.10 766 
ARS aise “30 69 =| —.04 ‘03 <05 03 | —07 721 
FA O5 RhE 02 <08 Oa ssl aae 00T O ‘215 
F+ <03 12 42 | E EEE) OIA 10 (339 
TAS —.16 a7 TAANA 08 | —.01 01 690 
SAS ERN 13 75 EOL); Boe igs ‘03 713 
MHS LSS 21 34 | —02 | —10 “52 02 | —03 738 
WRS —.06 | —01 aalis i) sad ell — 0. ‘01 ‘01 1213 
Es 35 (sl = SENA 19 21 104 50 00 751 
K 65: |) 24 w ah ens 15 LO PREIS ANE AES ‘692 
Pr AL 31 ta Gaited MES 42 | = l <05 173 
St (251 Hie a6 dS <24 48 ‘09 13 ‘09 ‘515 
L E oy 103 | —.02 09 | —05 09 “09 ‘498 
Variance

Common | 25.0% | 23.5% | 18.0% | 10.1% | 9.0% | 8.1% | 4.2% | 2.1%

Total | 15.5% | 14.5% | 11.1% | 6.3% | 5.6% | 5.0% | 2.6% | 1.3%


loadings on Factor II also show high loadings 
on Factor I. However, there are three notable 
differences: The TAS is high on Factor II 
only; the L and K scales are high on Factor I
only. 

Factor III, the third most important factor 
(18% of the common variance), shows high 
positive loadings for both the SAS and the 
ARS. The F+, MHS, and Pr scales also 
show moderate to appreciable loadings on this 
factor. Factor III appears to be a measure 
of the tendency to agree with items inde- 
pendent of their content. Because the R-S, 
MA, and E scales failed to show any
appreciable loadings on Factor III, the extraction
of this factor, though of general
interest, merely serves to indicate for the
purpose of the present study that the
tendency to agree plays a minor or negligible 
role in determining scores on the R-S, MA, 
and E scales. 

Factor IV shows a high positive loading 
for the E scale and appreciable negative 
loadings for the R-S, MA, and N scales. It 
should be noted that the E scale, which 
defines Factor IV, does not load appreciably 
on any other factor. 

Factors V, VI, VII, and VIII are relatively
minor factors resulting from the inclusion
of the additional scales in order to clarify
the relationships among the R-S, MA, and E 
scales, and their consideration is not relevant 
to the purposes of the present study. 
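
The two variance rows reported for each factor in Table 2 can be related by a short sketch. It assumes the usual definitions, which the text does not spell out: a factor's share of common variance is its sum of squared loadings divided by the sum over all retained factors, and its share of total variance is the same sum divided by the number of standardized variables. The function name and the small loading matrix are hypothetical.

```python
import numpy as np

def variance_percentages(loadings):
    """Percent of common and of total variance accounted for by each factor."""
    L = np.asarray(loadings, dtype=float)
    ss = (L ** 2).sum(axis=0)            # sum of squared loadings per factor
    common = 100.0 * ss / ss.sum()       # relative to all common variance
    total = 100.0 * ss / L.shape[0]      # each standardized variable has variance 1
    return common, total

# Hypothetical loadings for 3 variables on 2 factors, for illustration only.
demo_loadings = np.array([[0.8, 0.0],
                          [0.6, 0.0],
                          [0.0, 0.5]])
common_pct, total_pct = variance_percentages(demo_loadings)
```

Under these definitions the two rows differ by a constant factor equal to the mean communality. In Table 2 that ratio is roughly .62 (e.g., 15.5/25.0), so the eight factors together account for about 62% of the total variance.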


DISCUSSION


Table 1 shows a significant relationship 
(r = —.36) between the MA and E scales, 
which is in accord with the findings of other 
investigators (Jensen, 1958; Spence & Spence, 
1964). MA, it will be recalled, is largely 
determined by the traits reflected in Factors 
I, II, and IV, whereas E showed no appreci- 
able loadings on any factor other than Fac- 
tor IV. One, therefore, can logically conclude, 
as Eysenck (1957) has done, that MA scores 
may be related to the trait extraversion- 
introversion. The magnitude of the loading of 
MA on Factor IV, however, is not large 
(—.34) and, further, considering the nature 
of the factors on which MA has its major 
loadings (Factors I and II), it seems likely 
that MA is almost totally independent of 
extraversion-introversion. 

Turning next to a comparison between the 
R-S and E scales, the factorial relationship 
between these two scales is remarkably similar
to the relationship between the MA and
E scales described above. A similar conclusion 
is warranted here; that is, R-S scores are 
related to E, but the applicability of a 
cortical-inhibition model to studies using the 
R-S scale is questionable and requires re- 
course to experimental rather than correla- 
tional data. 

Though the relationships of R-S and MA 
with the E scale are in no sense definitive 
with regard to the contribution of extra- 
version-introversion to R-S and MA correlates, 
the high correlation between MA and R-S 
(r = .87) suggests that empirical relation- 
ships with MA or R-S may be equally inter- 
pretable within a drive or ego-defense theory. 
The fact that these two scales share approxi- 
mately 76% of their total variance in common 
makes such a conclusion highly plausible. A 
comparison of the distribution of factor load- 
ings of the MA and R-S scales further attests 
to their equivalence. Table 2 shows that both 
MA and R-S have their major loadings 
on Factors I and II and lesser loadings on 
Factor IV; this similarity suggests that their 
correlation of .87 is attenuated only by errors 
of measurement and not by their measuring 
different attributes. 
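
The arithmetic behind the 76% figure, and the sense in which a correlation can be attenuated by errors of measurement, can be sketched as follows. The shared-variance value follows directly from the reported r = .87. The reliability values passed to the correction are hypothetical, since no reliabilities for the MA and R-S scales are reported here, and the function names are illustrative only.

```python
import math

def shared_variance(r):
    """Proportion of variance two measures have in common (r squared)."""
    return r ** 2

def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Classical correction: estimated correlation between error-free scores."""
    return r_xy / math.sqrt(reliability_x * reliability_y)

r_ma_rs = 0.87                       # correlation reported in Table 1
shared = shared_variance(r_ma_rs)    # 0.7569, i.e., approximately 76%

# Hypothetical reliabilities of .87 for both scales, chosen only to show
# that an observed r of .87 would then imply a corrected correlation of 1.0.
corrected = correct_for_attenuation(r_ma_rs, 0.87, 0.87)
```

If the reliabilities of the two scales were in fact near .87, the observed correlation would be as high as measurement error permits, which is the sense of the claim that the scales do not measure different attributes.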

An interpretation of the psychological 
meaning of factors that are derived from 
scales which are themselves factorially com- 
plex raises at least as many questions as it 
attempts to answer. Nevertheless, two tenta- 
tive conclusions can be drawn from the pres- 
ent data about the potential meanings of 
Factors I and II and may serve as a basis 
for further research. Factor I is largely defined
by scores on the K and L scales of the
MMPI, and, it will be further recalled,
neither of these scales has any appreciable
loading on Factor II. An examination of the
other loadings within Factor I indicates that
it is related to the absence of expressed hostility
(MHS), manifest anxiety (MA), neuroticism
(N), and prejudice (Pr); the tendency
to repress (R-S); and the tendency to
disagree (ARS). The general picture sug- 
gested is one of an individual who is unwilling 
or unable to admit to psychological weakness; 
these relationships in combination with the 
high loadings of K and L indicate that Fac- 
tor I can most reasonably be considered either 
as a measure of the trait “defensiveness” or
the method factor “social desirability” 
(Edwards, 1957). Because of the apparent 
complexity of social desirability (Megargee, 
1966; Messick, 1960; Wiggens, 1966) and 
the questionable empirical basis of this con- 
struct (Rorer, 1965), our tentative preference 
is to label Factor I “Defensiveness.” 

Factor II is largely defined by scores on 
the TAS, a measure of anxiety under the 
stress of examinations; it will be further 
noted that TAS has no appreciable loading 
on Factor I. Further inspection of Table 2
shows that Factor II is highly related to neu- 
roticism (N), the presence of manifest anx- 
iety (MA), low ego strength (Es), and the 
absence of the tendency to repress (R-S). 
Though other interpretations are possible, 
Factor II can probably best be considered as 
a measure of “anxiety proneness” or “emo- 
tionality.” From the preceding it can be hy- 
pothesized that scores on the MA and R-S 
scales are determined by some combination 
of the orthogonal, bipolar traits of defen- 
siveness and emotionality. Some support for 
this hypothesis is provided by a recent study 
by Cohen (1967), who investigated both the 
defensive and emotional reactions to a stressor 
of repressors and sensitizers; he found that 
both defensive behavior and emotionality as 
measured by the GSR were a function of 
scores on the R-S scale, but that these two 
responses were unrelated to each other. 

As previously indicated, empirical correlates 
of R-S have been interpreted within the con- 
text of ego-defense theory, that is, as related 
to those attributes presumably measured by 
Factor I; similarly, correlates of MA have 
been seen as a function of drive, which it 
seems reasonable to hypothesize is reflected 
in Factor II, Emotionality (cf. Spence, 1958). 
Though R-S does have a slightly higher 
loading on Factor I and MA on Factor II, the 
difference between these loadings, for all prac- 
tical purposes, is negligible. 

Several logical conclusions are easily de- 
ducible from the proposition that R-S and 
MA are equivalent and largely determined by 
the two orthogonal traits defensiveness and 
emotionality. For example, a sensitizer might 
indicate one who is “emotional” or simply 
one who is not “defensive” or some intermediate
combination of these traits. Conversely,
a score in the repressor direction
might indicate an absence of emotionality 
or the presence of defensiveness. Therefore,
differences between repressors and sensitizers 
are potentially attributable to the influence of 
defensiveness as well as emotionality. Similar 
problems with respect to the interpretation of 
empirical differences between high and low 
MA subjects can also be logically deduced. 
Though problems of theoretical interpreta- 
tion might be mitigated by a consideration 
of the nature of the task variables employed 
in any particular experiment, there seems to 
be a clear need for improved response- 
defined measures of drive level and repression- 
sensitization to clarify the theoretical inter- 
pretation of empirical relationships with these 
scales.

In summary, this study showed that the 
MA and R-S scales are practically identical 
in psychological meaning, both of them being 
largely determined by two bipolar, orthogo- 
nal traits, defensiveness and emotionality. 
Further, scores on the MA and R-S scales 
were found to be almost totally independent 
of E, so that there was no convincing evi- 
dence that the empirical correlates of MA and 
R-S are a function of extraversion-introver- 
sion; the question of the applicability of a 
cortical-inhibition model towards explaining 
the empirical correlates of MA and R-S 
remains an open one. 

2 groups of retardates were matched for intellectual functioning and chronological
age at the time of institutionalization, diagnosis, and sex. The Environmental
Enrichment (E) group was enrolled in special education classes; the
Environmental Deprivation (D) group was not. After 4 yr. of these respective 
treatments, group differences in cognitive functioning were attributable to 
impairments in Group D rather than increments in Group E; individual differ- 
ences in cognitive functioning within these groups were attributable to initial 
intelligence level, diagnosis, and amount of parental contact. In the absence of 
special educational treatment, decrements in retardate cognitive functioning 
were proportional to length of institutionalization. These results were interpreted
as reflecting the interplay of learning, social, and motivational factors
in influencing retardate cognition.


The effect of institutionalization on the cog- 
nitive development of retardates is complex 
and not well understood. Institutionalization 
of retardates has variously been reported to 
enhance cognitive development (Crissey, 
1937; Kephart, 1940), to debilitate it (Cris- 
sey, 1937; Rushton & Stockwin, 1963; Sayegh 
& Dennis, 1965; Spitz, 1949), and to have no 
effect upon it (Alper & Horne, 1959; Holo- 
winsky, 1962). It seems reasonable to con- 
clude from the available literature that cogni- 
tive functioning will be enhanced if an insti- 
tution provides environmental enrichment and 
nurtural support, while cognitive functioning 
will be debilitated if the institutional environ- 
ment is barren and unstimulating (Clarke & 
Clarke, 1954; Crissey, 1937; Kephart, 1940; 
Spitz, 1949).

Paradoxically, this apparently self-evident 
generalization is not easy to defend. The best 
controlled of the environmental enrichment 
studies report widespread individual differences
among retardates in the degree of cognitive
development which they evidence in response
to “enrichment” conditions (Clarke &
Clarke, 1954; Kirk, 1958). The effect of 
environmental impoverishment seems to be 
somewhat more uniform, being generally de- 
bilitating to a greater or lesser degree (Spitz, 
1949), although individual variation is ap- 
parent here also (Rushton & Stockwin, 1963). 

The current report will be concerned, first, 
with the effects of environmental enrichment 
versus environmental deprivation in an insti- 
tution for retardates. Environmental enrich- 
ment is defined as exposure to a program of 
formal education and environmental depriva- 
tion as an absence of exposure to formal edu- 
cation. Admittedly, these concepts character- 
istically have implications beyond the pres- 
ence or absence of schooling. We justify our 
usage on the grounds that in an institution, 
at least, inclusion in a classroom program 
provides more than intellectual stimulation. 
The retardates are also supplied with richer 
personal interactions and experiences than 
would otherwise be available, so that the 
total psychosocial milieu is altered. 

Our second aim is to identify, on the one 
hand, the variables which are associated with 
accelerated cognitive development of mental 
retardates in response to environmental stimulation
and, on the other hand, the variables
which are associated with the ability of insti- 
tutionalized retardates to withstand cognitive 
impairment under conditions of environmental 
deprivation. 


METHOD 


Environmental Enrichment and 
Environmental Deprivation 


The data for this report were obtained from the 
records of the Wrentham State School in Wrentham, 
Massachusetts, a training institution for the mentally 
retarded. 

Environmental enrichment was provided by enroll- 
ment in the Karl Quinn School, an organization on 
the grounds of the parent institution, the Wrentham 
State School. The Karl Quinn School is responsible 
for providing a program of formal education for 
inmates of the Wrentham State School. Conversely, 
environmental deprivation was defined as nonenroll- 
ment in the Karl Quinn School. 

The Karl Quinn School, which enrolls about 25% 
of the inmates of Wrentham State School, is a pro- 
gressive, modern training facility which provides 
classes from prekindergarten through the seventh 
grade. All teaching staff are required to possess a 
BA degree, with certification in teaching and special 
education. Classes have no more than 10 students 
and meet for one-half day on every weekday. 

At the time of data collection, pupil selection for 
Karl Quinn was unsystematic and based upon physi- 
cians’ impressions that a given retardate was well- 
behaved or verbal enough to benefit from class. The 
policy of maintaining small classes meant that many 
were excluded who otherwise were eligible. No sys- 
tematic program existed for retardates not chosen for 
Karl Quinn, and many went unoccupied, although 
older persons might be assigned to routine jobs in
the institution.


Procedure 


Effect of the educational program. The basic pro- 
cedure for ascertaining the effect of the educational 
program of the Karl Quinn School involved a comparison
of the test-retest Stanford-Binet scores of
two matched groups of retardates: a group of 48 
retardates who had been exposed to the program of 
the school (Environmental Enrichment, Group E) 
versus a group of 48 retardates who had not been
so exposed (Environmental Deprivation, Group D). 
The writers are well aware of the literature (Mas- 
land, Sarason, & Gladwin, 1958) describing the 
limitations of intelligence tests for use with retarded 
persons. Even granting that intelligence test scores
of retardates may serve only as an indication of 
performance on a specialized series of cognitive tasks 
at one point in time, it was, nevertheless, felt that 
the results of such a procedure deserve to command 
interest. The retardates in Groups D and E had been 
individually matched as closely as possible for initial 
IQ, CA, and MA at the time of institutionalization
at Wrentham State School; for sex; for diagnosis; 
and for the period of time elapsing between the 
initial IQ test and the IQ retest. These data are 
presented in Table 1. 

Individual differences in response to enrichment 
and deprivation. Little work has been done to iden- 
tify the variables which are associated with differ- 
ences among institutionalized retardates in their 
relative ability to benefit from special education on 
the one hand or to withstand the effects of environ- 
mental deprivation on the other. However, it seemed 
reasonable to expect that both of these abilities, as 
measured, respectively, by IQ increments in Group 
E and the absence of IQ decrements in Group D, 
would be positively correlated with such factors as: 
(a) the number of years a retardate was successfully 
maintained in his home environment before institu- 
tionalization was required, as indicated by his 
chronological age upon institutionalization; (b) 
relatively intact cognitive ability, as indicated by IQ 
upon initial hospitalization; (c) nonorganic as op- 
posed to organic etiology of retardation; (d) a 
stimulating home environment prior to institutionali- 
zation, as measured by such indicators of socioeco- 
nomic status as paternal education and paternal 
occupation; and (e) maintenance of interest in the
institutionalized retardate by his family, as indicated 
by the number of visits received and the number of 
days spent on visit to the parental home during the 
year of retesting. It was also expected that IQ incre- 
ments and decrements would be in proportion to 
the amount of time spent in enrichment or depriva- 
tion conditions, respectively. 

Year-by-year effect of environmental deprivation. 
Originally, it had been expected that we would be 
able to observe the year-by-year effect of environ- 
mental deprivation upon institutionalized retardates. 
However, given the relatively small N of Group D, 
little opportunity was provided for a systematic 
study of the year-by-year effects of institutionaliza- 


TABLE 1 


COMPARISON OF THE ENVIRONMENTAL ENRICHMENT 
AND ENVIRONMENTAL DEPRIVATION GROUPS 


Item Group E Group D 

M admission IQ 43.1 ± 13.08 | 44.8 ± 12.25
M admission MA (in yr.) 3.42 ± 2.58 | 3.33 ± 2.45
M admission CA (in yr.) 7.12 ± 5.34 | 7.50 ± 5.48
Sex 40 

Male 32 

Female 16 18 
Diagnosis 1 26

Chronic brain syndrome 2 4 

Mongol 11 n 

Familial 9 3 

Undifferentiated 7 A 

No diagnosis 
M test-retest interval (in yr.) | 3.80 ± 3.16 | 3.80 ± 2.26


Note.—N = 48 in each group. 
tion upon intelligence test scores. Consequently, a
third group of 214 retardates (Group F) was 
studied. Group F members had received no formal 
educational training, but intelligence had been tested 
upon initial institutionalization and had been re- 
tested after periods of institutionalization ranging 
up to 9 years. These retardates were selected so that 
a sizable subsample had been retested after 1 year of 
institutionalization, another subsample after 2 years, 
etc., up to 9 years of institutionalization. The mean 
CA and IQ scores for each subsample at the time of 
initial institutionalization are shown in Table 3. 
Analysis of variance indicated no significant differ- 
ences in CA or IQ among the subgroups at time of 
initial institutionalization. 


RESULTS 


Differential Effects of Environmental 
Enrichment and Environmental Deprivation 


Upon retest after an average of 3.8 years at 
Karl Quinn School, Group E showed a mean 
gain of 1.40 IQ points (t = 1.40, p < .20).
After a corresponding period of no educational
experience, Group D showed a mean loss
of 9.65 points (t = 7.16, p < .001). The final
difference of 11.05 points between Groups
D and E was found to be statistically significant,
using a correlated t test of the difference
between the difference scores of Groups
E and D (t = 5.76, p < .001). Thus, the significant
difference between the two groups
was attributable to an IQ loss in Group D,
rather than a rise in Group E.
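
The comparison above rests on difference scores: each matched pair contributes the Group E change minus the Group D change, and the correlated t test asks whether the mean of those differences departs from zero. The following Python sketch illustrates the arithmetic under stated assumptions. The individual scores are not reported in the article, so `example_differences` is hypothetical, and `paired_t` is the textbook paired-t formula rather than a record of the authors' exact computation.

```python
import math


def paired_t(differences):
    """Return (t, df) for a correlated (paired) t test on difference scores."""
    n = len(differences)
    mean_d = sum(differences) / n
    # Unbiased sample variance of the differences.
    var_d = sum((d - mean_d) ** 2 for d in differences) / (n - 1)
    standard_error = math.sqrt(var_d / n)
    return mean_d / standard_error, n - 1


# Group means reported in the text: E gained 1.40 points, D lost 9.65 points.
mean_change_e = 1.40
mean_change_d = -9.65
final_difference = mean_change_e - mean_change_d  # the 11.05 points cited in the text

# Hypothetical pairwise differences (E change minus D change), for illustration only.
example_differences = [1.0, 2.0, 3.0, 4.0, 5.0]
t_value, df = paired_t(example_differences)
```

With 48 matched pairs the same function would return df = 47; only the reported means, not the pairwise data, are available here.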

The distribution of IQ gains and losses for 
Groups D and E is presented in Table 2. It 
is apparent that retardates who evidence a 
rise of 10 or more IQ points are much more 
common in Group E than in Group D (8 re- 
tardates in Group E versus 1 retardate in 
Group D); on the other hand, retardates who 
evidence drops of greater than 10 IQ points 
are more common in Group D than in Group 
E (21 retardates in Group D versus 4 in 
Group E). 


Individual Differences in Response to 
Environmental Enrichment 


Of all the variables under consideration,
only the initial IQ score was related to IQ
loss or gain under conditions of environmental
enrichment, and this was in the direction
opposite to that hypothesized. When Group E
was divided at the median into retardates
with a low initial IQ (42 or less) and retardates
with a high initial IQ (43 or more), it
was found that the low-IQ group gained an
average of 6.21 IQ points more than did the
high-IQ group (t = 2.54, p < .02).


TABLE 2


COMPARISON OF THE DISTRIBUTION OF Ss EVIDENCING
IQ GAINS AND LOSSES IN ENVIRONMENTAL
DEPRIVATION AND ENVIRONMENTAL
ENRICHMENT GROUPS


IQ Change        Group E        Group D

Gain
  20-24
  15-19
  10-14
  5-9
  0-4

Loss
  1-5
  6-10
  11-15
  16-20
  21-25
  26-30

[Cell counts for both groups are illegible in this copy.]


Note.—N = 48 in each group.


Individual Differences in Response to 
Environmental Deprivation 


Two of the experimental hypotheses were 
confirmed in regard to Group D: When the 
retardates were divided at the median on the 
basis of the number of visits they had to their 
parental homes during the year of their re- 
testing, 20 retardates who had one or more 
visits to their parental homes evidenced an IQ 
loss of only 6.35 points, as contrasted with a 
loss of 12.64 IQ points for 28 retardates who 
never once visited their parental homes (t = 
2.12, p < .05). 

It was also found that mongols of Group 
D evidenced a decline of 14.1 IQ points, as 
contrasted with a decline of 2.2 IQ points for 
familial retardates of Group D (t = 2.27, p
< .05). Thus, retardates with an organic
etiology lost more than retardates with a non- 
organic etiology. This last finding must ob- 
viously be regarded with caution, since only 
a very small number of persons was involved 
(5 familials and 10 mongols). 


TABLE 3


YEAR-BY-YEAR EFFECTS OF LONG-TERM ENVIRONMENTAL DEPRIVATION ON THE
IQ SCORES OF 214 INSTITUTIONALIZED RETARDATES


Years of       Retardates retested, with    M IQ       IQ gain or loss          Gain or loss    Percentage with
institution-   initial IQ and CA            decline                             ≥ 10 points     gain or loss
alization      N*    IQ          CA         on retest  Gain  Loss  No change    Gain   Loss     ≥ 10 points

1              70    44 ± 23     8 ± 8      - 0.72     25    42    3            11     8        27
2              59    42 ± 18     8 ± 7      - 3.46     20    32    7            5      13       31
3              54    44 ± 18     7 ± 4      - 5.46     10    41    3            1      12       24
4              51    42 ± 14     8 ± 5      - 6.43     9     39    3            0      10       20
5              42    45 ± 17     8 ± 7      - 7.69     4     36    2            1      14       36
6              31    46 ± 14     8 ± 5      - 8.35     6     24    1            4      15       61
7              20    45 ± 11     7 ± 6      - 8.55     1     18    1            0      6        30
8              24    43 ± 14     7 ± 6      - 6.73     4     19    1            2      10       50
9              14    [illegible] 7 ± 3      -10.42     0     14    0            0      6        42


* While the data from 214 retardates are represented in this table, the entries here total 365, since many of the retardates were
retested more than once (e.g., a given retardate may have been retested at the second and fifth year after institutionalization).


Year-by-Year Effects of Environmental
Deprivation 


The data for Group F showing the
year-by-year effects of environmental deprivation
are presented in Table 3. There is a
steady year-by-year decline that is unbroken
except for the eighth year after institutionalization.

However, it is apparent from Table 3 that 
there are wide individual differences in IQ 
scores. For the first 2 years following institutionalization,
substantial numbers of retardates
show IQ gains. For the first year, in fact, 
more retardates evidence large gains (of 10 
or more points) than evidence large losses. 
However, for the first 4 years after institu- 
tionalization, 70-80% of those retested show 
no large (10 or more IQ points) gain or loss. 
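
The percentages discussed above can be re-derived from the counts in Table 3. The sketch below does so in Python under stated assumptions. The N entries for years 4 and 9 are garbled in the scan, and the values 51 and 14 used here are inferred so that the N column totals 365, the figure given in the table footnote. All names in the code are illustrative.

```python
# (year, N retested, gain >= 10 points, loss >= 10 points), as read from Table 3.
# ASSUMPTION: N = 51 (year 4) and N = 14 (year 9) are inferred, not legible in the scan.
rows = [
    (1, 70, 11, 8), (2, 59, 5, 13), (3, 54, 1, 12),
    (4, 51, 0, 10), (5, 42, 1, 14), (6, 31, 4, 15),
    (7, 20, 0, 6), (8, 24, 2, 10), (9, 14, 0, 6),
]


def percent_large_change(n, gain, loss):
    """Percentage of retested retardates whose IQ moved 10 or more points."""
    return 100.0 * (gain + loss) / n


percentages = {year: percent_large_change(n, gain, loss) for year, n, gain, loss in rows}
total_retests = sum(n for _, n, _, _ in rows)

# Years in which large gains outnumber large losses (the text singles out year 1).
gain_exceeds_loss = [year for year, _, gain, loss in rows if gain > loss]
```

Under these assumptions the computed percentages for years 1 through 8 round to the tabled values, which is some reassurance that the counts were read correctly.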

As may be seen from Table 3, most of the 
retardates who evidenced large IQ gains fol- 
lowing institutionalization did so in the first 2 
years after admission. An attempt was made 
to ascertain what variables differentiated re- 
tardates who gained 10 or more IQ points in 
the first 2 years following institutionalization 
from those who lost 10 or more in the first 2 
years. Consequently, all retardates who had
shown IQ gains of 10 or more points following
admission (N = 15) were compared with all
retardates who had evidenced losses of 10 or
more points in the 2 years following admission
(N = 20) in regard to: (a) IQ upon initial
institutionalization, (b) chronological age
at time of institutionalization, (c) etiology of
retardation, (d) sex, (e) paternal education,
(f) paternal occupation, (g) number of parental
visits received during the year in which
they were retested, (h) number of days spent
on visit to the parental home during the year
in which they were retested.

Only one of the above comparisons be- 
tween the gain and loss groups was statisti- 
cally significant: The mean IQ upon initial 
institutionalization of the gain group was 29, 
while the mean initial IQ of the loss group 
was 52 (t = 4.33, p < .001). Thus, retardates
who evidence large IQ gains in response to 
institutionalization tend to evidence a low ini- 
tial IQ, while those who evidence large IQ 
losses in response to institutionalization tend 
to evidence a high initial IQ. 


DISCUSSION


As expected, conditions of environmental 
enrichment and environmental deprivation 
produce differential effects on the two matched 
groups of retardates. The decrement in IQ 
scores of the Environmental Deprivation 
group is consonant with earlier work (Crissey, 
1937; Rushton & Stockwin, 1963; Sayegh & 
Dennis, 1965; Spitz, 1949). However, the 
fact that no significant mean change was ob- 
served in the Environmental Enrichment 
group was unexpected, since Kephart (1940) 
and Kirk (1958) reported IQ increments in 
institutionalized retardates exposed to special 
educational programs. The answer may reside 
in sample differences, since it appears that 
Group E was a relatively heterogeneous, yet
generally low-grade group, compared to the 
retardates employed by Kephart (1940) and 
Kirk (1958). 

It would, however, be incorrect to conclude 
from Group E’s unchanged mean score that 
environmental enrichment had no enhancing 
effect at all upon IQ, since (as Table 2 dem- 
onstrates) more members of Group E evi- 
denced large IQ gains and avoided large IQ 
losses than was the case with Group D. 

Large individual differences in response to 
environmental conditions are apparent both 
within Group E and Group D. However, the 
data seem to suggest that IQ change under 
conditions of environmental enrichment is not 
a function of the same factors that produce 
IQ change under conditions of environmental 
deprivation. 


Individual Differences in Intelligence Test 
Performance as a Function of Environmental 
Stimulation 


Under conditions of environmental enrich- 
ment, retardates with high initial IQs show 
less gain than retardates with low initial IQs. 
Statistically, the tendency for there to be, 
upon retest, a diminished range between the 
highest and lowest scoring members of a 
group is known as a regression effect. How- 
ever, the fact that no similar effect was found 
in Group D suggests that other factors may 
be operating in addition to, or instead of, a 
simple statistical regression effect.
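
The regression effect described above can be illustrated with a small simulation. The Python sketch below is a hypothetical illustration, not an analysis of the study's data. It assumes each observed score is a stable true score plus independent measurement error, and the mean of 45 and the two standard deviations are arbitrary choices. No environmental effect is simulated, yet the half of the sample that scored low on the first test gains on retest and the half that scored high loses.

```python
import random
import statistics


def simulate_retest(n=10_000, true_sd=10.0, error_sd=8.0, seed=0):
    """Mean test-retest change for the low and high halves of a median split."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(45.0, true_sd) for _ in range(n)]
    # Each testing adds fresh, independent measurement error to the true score.
    first = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    second = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    cutoff = statistics.median(first)
    low_changes = [s - f for f, s in zip(first, second) if f <= cutoff]
    high_changes = [s - f for f, s in zip(first, second) if f > cutoff]
    return statistics.mean(low_changes), statistics.mean(high_changes)


low_gain, high_gain = simulate_retest()
```

Because this pattern arises from measurement error alone, a differential gain after a median split on initial IQ cannot by itself separate a program effect from regression, which is why the contrast with Group D carries weight in the argument.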

We believe this finding may imply that the 
program at Karl Quinn School is pitched at 
a level which provides a more intellectually 
stimulating experience for retardates who are 
initially operating at a low rather than a high 
level. The program necessarily provides a 
norm for learning which the majority of re- 
tardates in the class might realistically be 
expected to achieve. A mean intellectual level 
is reinforced around which the IQ scores of 
all of the retardates in the group would tend to 
cluster; that is, those retardates who initially 
had the lowest scores would be most stimu- 
lated and would evidence the largest IQ incre- 
ment, while those who initially evidenced the 
highest scores would be relatively less stim- 
ulated and would tend to fall back towards 
the group norm. 


Individual Differences in Intelligence Test
Performances as a Function of Environmental 
Deprivation 

In terms of the above argument, it is un- 
derstandable that initial IQ does not relate to 
IQ change in Group D, since Group D re- 
tardates are not provided with a systematic 
background of environmental stimulation 
which may serve to reinforce a common level 
of adaptation. However, the data from Group 
D provide evidence which suggests that in the 
absence of a structured classroom background 
to supply a level of intellectual stimulation, 
social or motivational factors become of prime 
importance in determining IQ. In Group D, 
retardates who enjoyed continued parental 
interest, as evidenced by parental willingness 
to accept the retardates on home visits, showed 
relatively small IQ losses as contrasted with 
retardates whose parents did not accept their 
children home on visit. Parental interest may 
serve to enhance retardate performance in a 
testing situation by elevating the retardate’s 
motivation and interest in his environment. In 
a similar vein, the work of Zigler and his col- 
leagues (Butterfield & Zigler, 1965; Shepps & 
Zigler, 1962; Zigler, 1961, 1963; Zigler & 
Williams, 1963) demonstrates that social in- 
teraction may serve as a motivating force in 
determining retardate performance in a test 
situation. 

An alternative explanation might be offered 
to the effect that intellectual stimulation re- 
ceived in the parental home may fill the void 
created by the absence of formal educational 
experiences in the institution. This seems to 
be unlikely, given the small mean number of 
days (27) which retardates on home visit 
actually spent in the parental home.

In regard to the relationship between etiology
of retardation and IQ change under
conditions of environmental deprivation, it
was found that the intelligence test performance
of retardates with a (nonorganic) diagnosis
of familial retardation was less impaired
than was the intelligence test performance of
retardates with an (organic) diagnosis of
mongolism. While the number of subjects
involved in this analysis was admittedly small,
the finding is lent some additional credence
by the fact that similar relationships have
been reported by Strauss and Kephart (1939)
and by Miyamoto (1960). No similar relationship
appeared under conditions of environmental
enrichment. Consequently, it seems
reasonable to conclude tentatively that etiology,
like parental interest, becomes most
important as a determinant of IQ under
conditions of environmental deprivation, where
there is no direct support of some level of
intellectual performance. This may suggest that those
forms of retardation which are thought to be
most debilitating, such as mongolism, are in
fact most crippling when no attempt is made
to compensate for the disability through a
formal educational program.


Year-by-Year Effect of Institutionalization
upon IQ Scores 


The findings in Table 3 suggest that insti- 
tutionalization in the absence of educational 
support tends to have a debilitating effect 
proportional to length of institutionalization. 
In comparable studies Rushton and Stockwin 
(1963) have reported a similar degree of IQ 
decrement as a function of number of years 
of institutionalization, while Alper and Horne 
(1959) reported no marked detrimental effects 
of institutionalization upon IQ.

The variation in IQ scores in Group F dur- 
ing the first 2 years of institutionalization is 
strongly related to initial IQ: Retardates who 
lose 10 or more IQ points initially evidence a 
high IQ upon institutionalization, while re- 
tardates who gain 10 or more points initially 
evidence a low IQ. Superficially, these results 
seem to contradict our earlier mentioned find- 
ing, since no relationship between initial 
intelligence level and IQ change occurred in 
Group D, which, of course, also experienced 
deprivation. However, there is, in fact, no 
contradiction; the relationship in Group F 
involves only retardates retested during the 
first 2 years of institutionalization, of whom 
there were very few in Group D, and the 
relationship in Group F involves only extreme
cases (those who gained or lost more than 10
IQ points). In Group D the analysis involved 
the entire sample because there were too few 
subjects to employ only extreme cases. Only 
one retardate in Group D showed a gain of as 
much as 10 IQ points. 




The fact that in Group F large IQ gains 
are posted by persons with initially low IQ 
scores while large losses are evidenced by 
those with initially high IQ scores, may be a 
statistical artifact (i.e., a regression effect).
However, the finding may be open to the 
same interpretation as a similar finding in 
Group E: An institution for the retarded, like 
a classroom, may reinforce a level of intel- 
lectual adaptation that is suitable for the 
median members of the group, that is, one 
that is neither too advanced for the duller 
members nor too elementary for the brighter 
members. Consequently, IQ scores will tend 
to cluster around that median level of adapta- 
tion. 

The difference between the institution as a 
whole and the classroom situation is that in 
the small homogeneous setting of the class- 
room there is constant reinforcement of a 
given level of intellectual adaptation, while in 
the larger heterogeneous setting of the institu- 
tion such reinforcement is necessarily more 
diffuse. Consequently, for a class at Karl 
Quinn School a tendency toward a clustering 
of IQ scores may become apparent in the test 
results of the class as a unit, while for a group 
of retardates whose only “classroom” is the 
larger setting of the institution itself this 
effect will not become apparent in a small 
random sample (such as Group D) but only 
in those members of the institution whose 
scores are on the extremes of the distribution 
(i.e., those whose initial scores are either very 
high or very low). 

It should be noted, finally, that the finding 
that the greatest IQ gains occur in retardates 
with the lowest initial IQs is in accord with a 
similar report by Clarke and Clarke (1954), 
but stands in contradiction to a report by 
Holowinsky (1962). 

In conclusion, failure to provide formal 
training experiences for institutionalized re- 
tardates plainly has a deleterious effect upon 
IQ. While many of the retardates in Group E 
would be considered too low-grade by conven- 
tional standards to benefit from special class- 
room education (as evidenced by the fact 
that the initial IQ scores of half of the group 
were 40 or less), our data suggest that it was 
these very retardates who may have benefited 
most. The implication may be drawn that 
many retardates who have been classified as
“uneducable” on the basis of a low IQ score 
may, in fact, benefit from classroom experi- 
ence. 

The question may be raised of the effect of 
“enrichment” and “deprivation” upon the 
adaptive behavior, rather than upon the IQ 
scores, of these same subjects. Data are cur- 
rently being collected which bear on this prob- 
lem. 


TEST ANXIETY, STRESS, AND VERBAL BEHAVIOR


MURRAY MEISELS1 


Eastern Michigan University 


2 levels of S anxiety and 2 levels of experimentally induced stress were used 
to study the following verbal indexes of transitory anxiety: a content mea- 
sure, the type-token ratio, verb-adjective ratio, speech disturbance ratios, and 
intrusions. High- and low-test-anxious Ss were placed in high- and low-stress
conditions, and responses given to 10 TAT cards were scored on each verbal 
category. Findings were: content and, to a lesser extent, the verb-adjective 
ratio varied with anxiety-stress group; type-token ratios were unrelated to 
anxiety-stress condition; and speech disruption and intrusion ratios decreased 
under conditions of maximal anxiety arousal. Results have implications for 
test-anxiety theory, for the validity of the verbal measures, and for the
representational versus instrumental models of verbal communication.


The study of verbal anxiety indexes has 
received increasing attention recently, though 
the majority of studies have been correlational 
in nature (Mahl & Schulze, 1962) and valid- 
ity has not been adequately established. Fur- 
ther, the few studies which have employed 
experimental manipulations of stress have 
typically been concerned with only one mea- 
sure, speech disturbance (Mahl, 1959). The 
major purpose of the present research was to 
assess the validity of five diverse measures of 
transitory anxiety, measures ranging from 
those which focus on lexical content to those 
which deemphasize content and stress the 
nonlexical “process” of speaking. 

The general hypothesis under investigation 
was that verbal indicators of transitory anxi- 
ety vary as a function of the subject’s char- 
acteristic anxiety level, as measured by a test, 
and degree of experimentally induced stress. 
A testlike stress situation was chosen which 
appeared to have realistic anxiety-inducing 
properties for college student subjects. Two
levels of subject anxiety and two levels of
stress were used, and significant main and
interaction effects were predicted for each of
the verbal measures. These predictions are
consistent with test-anxiety theory, which


1 This study is based on a dissertation submitted to 
posits that it is only under stressful, but not
neutral, conditions that behavioral differences 
obtain between high- and low-anxious groups 
(Sarason, Davidson, Lighthall, Waite, & Rue- 
bush, 1960). 

The five indexes of anxiety in verbal be- 
havior, with associated rationales, are pre- 
sented below. Significant main and interac- 
tion effects were predicted for each measure. 

1. Content. On the basis of psychoanalytic 
theory and clinical experience, Gottschalk and 
his co-workers (Gleser, Gottschalk, & Springer, 
1961) developed a content measure (here 
labeled the Content Anxiety Scale) which 
classified anxiety into six subtypes: death, 
mutilation, separation, guilt, shame, and dif- 
fuse (or nonspecific). The working assumption 
is that reference to any of these themes reflects 
underlying anxiety. 

2. Type-token ratio. The type-token ratio is
defined as the ratio of the number of
different words (types) to the total
number of words (tokens) in a sample of
language. Segmental ratios, the number of 
different words used in each of many small 
segments (e.g., 25-word units) of the total 
verbal production, are usually studied. Essen- 
tially, this is a measure of breadth of vocab- 
ulary, which can be considered as one aspect 
of a well-organized response and which should 
vary inversely with anxiety level (Mahl & 
Schulze, 1962). 

3. Verb-adjective ratio. The verb-adjective
ratio is defined as the ratio of the number of verbs
to the number of adjectives in a sample of
language. High values are considered to
connote tension and anxiety (Balken &
Masserman, 1940).

4. Speech disturbance ratios (SDRs). Mahl 
(1959, 1961) elaborated two SDR scales, the 
“Ah” ratio, consisting of Ah and its variants, 
and the non-Ah ratio, consisting of sentence 
change, repetition, stutter, omission, sentence 
incompletion, tongue slip, and intruding inco- 
herent sound. These ratios are defined as the 
number of nonfluencies over the total number 
of words in a sample of language. Mahl con- 
siders that disruptions in the flow of speech 
are consequences of anxiety in the same way 
that disruption of any fine motor coordination 
can be attributed to anxiety. He further con- 
siders that “process” analysis (SDRs) should 
be a more useful indicator of anxiety than 
analysis of lexical content, since the speaking 
process frequently lies outside the individual’s 
awareness and, hence, is less subject to volun- 
tary control. 

5. Intrusions. Intrusions (e.g., coughs and 
laughs) are also disruptions in the flow of 
speech, and the same rationale applies to them 
as applies to SDRs. The intrusion measure 
used in the present study was adopted from 
Krause and Pilisuk (1961) and consisted of 
five categories: sighs, laughs, coughs, throat 
clearing, and deep breathing. 
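
The ratio measures defined in items 2 through 4 reduce to simple counts. The Python sketch below shows one way they might be computed. It assumes the transcript has already been split into words and that verbs, adjectives, and nonfluencies have been counted by a trained scorer, as in the study. The function names and the sample sentence are illustrative and not taken from the cited scoring manuals.

```python
def type_token_ratio(words):
    """Different words (types) divided by total words (tokens)."""
    tokens = [w.lower() for w in words]
    return len(set(tokens)) / len(tokens)


def segmental_type_counts(words, segment_length=25):
    """Number of different words in each complete segment of the sample."""
    tokens = [w.lower() for w in words]
    # Drop a trailing partial segment so that every count covers the same length.
    usable = len(tokens) - len(tokens) % segment_length
    return [
        len(set(tokens[i:i + segment_length]))
        for i in range(0, usable, segment_length)
    ]


def verb_adjective_ratio(n_verbs, n_adjectives):
    """Number of verbs divided by number of adjectives (hand-scored counts)."""
    return n_verbs / n_adjectives


def speech_disturbance_ratio(n_nonfluencies, n_words):
    """Number of nonfluencies (Ah or non-Ah) divided by total words."""
    return n_nonfluencies / n_words


# Hypothetical 13-word sample containing 9 different words.
sample = "the boy looks at the violin and the boy thinks about the music".split()
ttr = type_token_ratio(sample)
segments = segmental_type_counts(sample, segment_length=5)
```

The segmental version is the one the article reports using (25-word units); the 5-word segments in the usage lines are only to keep the example short.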


METHOD 
Subjects 


The Mandler-Sarason Test Anxiety Questionnaire 
was initially administered to 391 females in intro- 
ductory psychology classes at the State University of 
New York at Buffalo. From the extreme ends of the 
continuum of scores on this questionnaire 40 high- 
anxious (HA) and 40 low-anxious (LA) Ss were 
selected for the study. The 40 HA Ss were selected 
from the upper 14.1% of the distribution, with the 
cutoff point at 125; and the 40 LA Ss were selected 
from the lower 16.1% of the distribution, with the 
cutoff point at 79. Half of each HA and LA group 
was assigned to either the high-stress or low-stress 
condition. There were then four groups, each con- 
sisting of 20 Ss. To equate the groups, the assign- 
ment of S to a stress condition was determined on 
the basis of Test Anxiety Questionnaire score and 
high school grade-point average (GPA), the latter 
being used as a rough index of intellectual function- 
ing. The mean GPAs of the four groups were ap- 
proximately equal, with the greatest discrepancy be- 
tween any two groups being less than one percentage 
point. 


Stimuli


The stimuli used were a complete set of TAT 
cards and a modified form of the Wechsler-Bellevue 
Digit Symbol scale. 


Recording 


A Wollensak Model T-1616 tape recorder was used 
for all recordings. The standard accessory micro- 
phone was adapted for lavaliere use and rested on 
S’s chest. Two tracks were used for all recordings. 


Procedure 


The time period between the administration of the 
Test Anxiety Questionnaire and the experiment 
ranged from 1 to 4 months, to minimize risk that Ss 
would recognize the relationship between testing and 
the experiment. In the experiment, each S was tested 
individually. The TAT was administered first, with 
the 31 TAT cards arranged in the order 1-20 for 
females, followed by the remaining 11 cards in nu- 
merical order. Standard TAT instructions were re- 
cited (Murray, 1943), and while S put on the 
microphone, the instructions were repeated. The E 
conducted necessary inquiry only on the first two 
cards. After S completed Cards 1-10 the TAT was 
interrupted and the modified Digit Symbol scale was 
administered. Essential TAT instructions were then 
repeated, and S completed the remaining 21 cards. 

The verbal measures of anxiety studied are based 
on the verbal responses to 10 TAT cards (11-20). 
The responses to these cards by HA and LA Ss un- 
der high stress were considered to reflect the results 
of maximum anxiety arousal, since these Ss had just 
received two failure reports. The HA and LA Ss in 
the low-stress condition, in contrast, had received 
positive feedback and were considered to have ac- 
climated to the test situation. The responses given to 
Cards 11-20 were transcribed by trained secretaries, 
and, to insure accuracy, were then independently 
edited by another secretary. 

Instructions. In the high-stress condition, in order 
to evoke a strong anxiety response, E adopted a 
detached, “cold” attitude, offered little support, and 
conducted the experiment in a businesslike manner. 
At the start of the experiment, the following ego- 
involving instructions were given by E: 


I work on the psychiatric ward of the Veterans 
Administration Hospital as a clinical psychologist, 
that is, someone who is interested in the analysis 
of personality. Right now, I’m going to give you 
the Thematic Apperception Test. This test is widely 
used to measure the kind of personality you are, 
and to find out the kinds of problems and con- 
flicts you have. It is also a test of imagination, 
which is one aspect of intelligence. Of course, 
your own personality and your own intelligence 
will determine whether you do well or poorly on 
this test. 


In the low-stress condition, E adopted a friendly
and supportive manner and initiated the experiment 
with the following “neutral” instructions: 


I’m a student in psychology here at the university.
Some of us are doing some research to find out 
about the stories that people make up. Very little 
is known about this, and you are being asked to 
help in this early stage of research so that we can 
get an idea about what stories are told to different 
pictures.


These instructional sets were followed by standard 
TAT instructions. After Card 10, an effort was made
to intensify the anxiety reaction in the high-stress 
condition, and E said: 


Look, uh, you’re not doing too well, uh, maybe 
you're not yourself today. We’ll leave this for a 
few minutes and see if you change when we come 
back to it. Right now, I'll give you another test. 
This one comes from the Wechsler Adult Intelli- 
gence Scale, which is the best IQ measure we have. 
Again, your own intelligence will determine how 
well or poorly you do. 


This instructional set was followed by presentation 
of the modified Digit Symbol scale. For Ss in the 
low-stress condition, the modified Digit Symbol scale 
was introduced as an exploratory scientific effort to 
ascertain “the kinds of things that people find inter- 
esting.” Upon completion of this test, the differen- 
tial instructions were continued: E offered a mildly 
supportive comment in low-stress conditions, but 
intensified the failure experience in high-stress groups 
by saying: “Well, we’re not getting anyplace with 
this. Let’s see if there is any improvement in your 
stories.” After finishing the TAT the experiment was 
complete, though Ss engaged in a further task not 
pertinent to this study. In the high-stress groups, Ss 
were informed of the purpose of the experiment and 
sworn to secrecy before leaving the experimental 
room.

Scoring. The Content Anxiety Scale, verb-adjec- 
tive ratio, type-token ratio, SDRs, and intrusions 
were scored directly from transcripts; SDRs and 
intrusions were scored from transcripts while listen- 
ing to the tapes. Except for segmental type-token 
ratios, which were defined as the number of dif- 
ferent words in 25-word units, interrater reliability 
was determined for the measures on the basis of 
independent scoring by two raters. Correlation co- 
efficients were based on the ratings of 80 stories 
(from 8 TAT protocols), and all reliability coefficients 
were acceptable, with the lowest being .89. Scoring 
criteria for each verbal measure adhered to estab- 
lished rules and procedures (Balken & Masserman, 
1940; Gleser et al., 1961; Krause & Pilisuk, 1961;
Mahl, 1961).


RESULTS


Scoring proceeded as follows: For each 
subject the 10 stories were scored on each 
measure, and for each subject on each measure 
the scores were summed across 10 stories to 
provide a total score. For a given verbal measure,
the data analysis consisted of the comparison
of mean scores of the four anxiety-stress
groups.


Content Anxiety Scale 


For this measure, higher scores reflect a 
greater frequency of references to anxious 
content. Table 1 presents the mean scores for 
four groups, and the results (Table 2) indicate 
that this scale differentiated between the four 
groups. Both the anxiety and stress effects 
were significant, though the interaction effect 
was not significant. 
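
For a balanced 2 × 2 design, the between-groups mean squares can be recovered from the four cell means alone. The Python sketch below does this for the Content Anxiety Scale. It assumes that the cell means 7.52, 4.95, 6.05, and 4.27 in Table 1 belong to the content measure, and it takes n = 20 per cell from the Method section. The function name is illustrative. Each effect has 1 df, so its sum of squares equals its mean square.

```python
def two_by_two_mean_squares(cells, n_per_cell):
    """Mean squares (1 df each) for A, B, and A x B from the four cell means."""
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    grand = sum(cells.values()) / len(cells)
    mean_a = {a: sum(cells[(a, b)] for b in levels_b) / len(levels_b) for a in levels_a}
    mean_b = {b: sum(cells[(a, b)] for a in levels_a) / len(levels_a) for b in levels_b}
    ss_a = n_per_cell * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n_per_cell * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    # Interaction: what remains of each cell mean after removing both main effects.
    ss_ab = n_per_cell * sum(
        (cells[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
        for a in levels_a
        for b in levels_b
    )
    return {"anxiety": ss_a, "stress": ss_b, "interaction": ss_ab}


# ASSUMPTION: these are the content-measure means from Table 1 (HA/LA by HS/LS).
content_means = {
    ("HA", "HS"): 7.52, ("HA", "LS"): 4.95,
    ("LA", "HS"): 6.05, ("LA", "LS"): 4.27,
}
mean_squares = two_by_two_mean_squares(content_means, n_per_cell=20)
```

Under these assumptions the results come to about 23.1 for anxiety, 94.6 for stress, and 3.1 for the interaction, which appears to agree with the Content column of Table 2 to within rounding of the cell means.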


Type-Token and Verb-Adjective Ratios 


The comparison of mean scores on the 
type-token ratio indicated that this measure 
failed to differentiate between groups (Table 
2). Higher type-token ratios reflect greater 
diversity of vocabulary, and it had been ex- 
pected that anxiety would reduce vocabulary 
diversity. For the verb-adjective measure, 
higher ratios, that is, a relatively greater use 
of verbs than adjectives, were considered to 
reflect greater anxiety. Results with the verb- 
adjective ratio partially conformed with pre- 
diction; the anxiety effect was significant (p 
< .05, see Table 2), and the HA groups had 
higher ratios (M = 6.30) than the LA groups 
(M = 5.04). The lack of significant results 
with the stress and interaction effects, how- 
ever, was not in accord with expectation. 


Speech Disturbance Ratios 


SDRs were analyzed for Ah and non-Ah 
categories, since previous studies had suggested 
that anxiety is related to non-Ah but not to 
Ah speech disturbances (Mahl, 1959). The 
Ah ratio failed to differentiate between the 
groups (Table 2); for the non-Ah SDR, only 
the interaction effect was significant. Inspec- 


TABLE 1 


MEAN Scores or Four Anxiety-Srress Groups 
ON THREE VERBAL INDEXES OF ANXIETY 


Non-Ah SDR | Intrusions 


Content 


Condition 
HS | LS 


3.71 


h anxiety | 7.52 | 4.95 
re 3.79 


Low anxiety | 6.05 | 4.27 


Note.—Abbreviated: HS = high stress, LS = low stress. 
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TABLE 2 
ANALYSIS OF VARIANCE OF VERBAL INDEXES OF TRANSITORY ANXIETY FOR 
Four Anxiety-Stress Groups, MEAN SCORE VALUES 
Item df Content Type-token Verb-adjective Ah Non-Ah Intrusions 
Anxiety 1 2313 84 32.0* 2.6 11.4 10 
Stress 1 95.0*** 18 2.3 0.1 1.2 1.40** 
Interaction T 3.1 .01 1.0 5.8 13.9* 72 
Within groups 76 3:5 42 5.2 8.8 3.4 .19 
(errors) 
* 
“p S01 
wet p < 001. 


tion of the group means for the non-Ah SDR 
(Table 1), however, indicated that under high 
stress, LA subjects increased in the frequency 
of non-Ah disruptions but that, contrary to 
prediction, HA subjects produced fewer dis- 
ruptions under high-stress conditions. Bart- 
lett’s test for homogeneity of variance was 
computed and found significant (p < .05, χ²
= 8.56, df = 3), and the distribution of the
variances showed increasing variability in the 
LA groups under stress but decreasing varia- 
bility in the HA groups. Thus, HA subjects 
under high stress showed decreases in both 
mean and variance scores. 
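
Bartlett's statistic can be sketched as follows. This is an illustration under stated assumptions, not the authors' computation: the function name and the group scores are hypothetical, and only the reported χ² of 8.56 with df = 3 comes from the text.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's chi-square statistic for homogeneity of variance.

    Returns (statistic, df), where df is the number of groups minus one.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    if k < 2 or min(sizes) < 2:
        raise ValueError("need at least two groups with two or more scores each")
    variances = [float(variance(g)) for g in groups]  # sample variances (n - 1)
    if min(variances) <= 0:
        raise ValueError("every group must have a nonzero variance")
    n_total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction, k - 1


CHI2_CRIT_DF3_P05 = 7.815  # tabled chi-square critical value, df = 3, alpha = .05

# Hypothetical non-Ah scores for four anxiety-stress groups (invented data).
hypothetical_groups = [
    [3.1, 3.4, 2.9, 3.3, 3.0],  # HA, high stress: tightly clustered
    [2.0, 5.5, 3.8, 6.1, 1.9],  # HA, low stress
    [1.5, 6.8, 4.2, 7.9, 2.6],  # LA, high stress
    [2.8, 4.9, 3.5, 5.2, 2.4],  # LA, low stress
]
stat, df = bartlett_statistic(hypothetical_groups)
print(stat, df)
print(8.56 > CHI2_CRIT_DF3_P05)  # the reported value exceeds the .05 criterion
```

With four groups the statistic is referred to a chi-square distribution with 3 degrees of freedom, which matches the df reported in the text.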


Intrusions 


Table 2 indicates that for the intrusion 
category only the stress effect was significant 
(p< .01), but, contrary to prediction, the 
high-stress groups produced fewer intrusions. 
While the interaction effect was not signifi- 
cant, further analysis with the Mann-Whitney 
U test showed that HA subjects yielded sig- 
nificantly fewer intrusions under high stress 
than under low stress (U = 2.88, p < .001),
while there was no significant difference be-
tween LA groups (U = .59, p > .25). Simi-
larly, Bartlett’s test was significant (p < .05,
χ² = 10.37), with the smallest variance in the
HA high-stress group. It was concluded that 
there was a decrease in frequency and varia- 
bility of intrusions in HA subjects in the 
high-stress condition. Since intrusions are an- 
other category of speech disruptions, the vari- 
ous SDR measures (Ah and non-Ah) were 
combined with each other and with intrusions. 
None of the resulting F ratios was significant, 
though Bartlett’s test was again significant,
and in each instance the smallest variance was
in the HA high-stress group.
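
The F ratios behind the significance markers in Table 2 can be recovered by dividing each effect entry by the within-groups entry. The sketch below assumes that the tabled entries are mean squares; the function name is hypothetical, and the numbers are the values printed in Table 2.

```python
def f_ratio(ms_effect, ms_error):
    """F = effect mean square divided by within-groups (error) mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Entries as printed in Table 2; each test has df = 1 and 76.
verb_adjective_anxiety = f_ratio(32.0, 5.2)  # anxiety effect, verb-adjective ratio
non_ah_interaction = f_ratio(13.9, 3.4)      # interaction effect, non-Ah SDR
intrusions_stress = f_ratio(1.40, 0.19)      # stress effect, intrusions

for label, value in [
    ("verb-adjective, anxiety", verb_adjective_anxiety),
    ("non-Ah, interaction", non_ah_interaction),
    ("intrusions, stress", intrusions_stress),
]:
    print(f"{label}: F(1, 76) = {value:.2f}")
```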


Discussion 


Of five verbal measures, only the Content 
Anxiety scale varied as a function of both 
anxiety and stress condition. Though the in- 
teraction effect was not significant, the results 
with this measure provide support for the use- 
fulness of verbal content as an index of transi- 
tory anxiety and for the psychoanalytic the- 
orizing which led to the development of the 
Content Anxiety scale. These results also pro- 
vide support for the efficacy of the anxiety- 
stress manipulation. 

In contrast to the content measure, it was 
surprising to find that speech disruptions, 
both SDRs and intrusions, decreased under 
high stress, particularly among HA subjects. 
One possible explanation of the findings is 
that the choice of experimental stimuli, which 
included the experimenter as one stimulus ob- 
ject, inhibited the expression of speech non- 
fluencies and that subjects consciously at- 
tempted to control these “undesirable” speech 
habits. The obvious presence of tape-record- 
ing equipment may have contributed to a 
heightened awareness of the “process” of 
speaking; in addition, the experimental condi- 
tions or the experimenter may have empha- 
sized the production of logically coherent and 
well-organized speech, and this requirement 
might well have inhibited speech disruptions. 
In this vein, for the HA subjects under high 
stress the present conditions might be remi-
niscent of a stagelike performance in which
“actors” have learned to recognize anxiety 
and to guard against any disruption in the 
flow of speech. Contrasting with Mahl’s
(1961) hypothesis that speaking “process” is 
often outside of the individual’s control, the 
results of the present study indicate that 
speech disturbances may be subject to volun- 
tary inhibition. Another possible explanation 
is that the speech-disturbance-anxiety rela- 
tionship does not hold at high anxiety levels. 
This possibility was demonstrated in a recent 
study by Kasl and Mahl (1965) which re- 
ported a curvilinear relationship between non- 
fluencies and anxiety level, when anxiety level 
was defined by scores on the Taylor Mani- 
fest Anxiety scale (Taylor, 1953). 

The differing results with the content and 
speech disruption measures have bearing on 
the “representational” versus “instrumental” 
models that have been employed in the study 
of human communication (Pool, 1959). 
Briefly, the representational model assumes 
that lexical content directly reflects the intent 
of the speaker and postulates an isomorphic 
relationship between content and feeling state. 
The instrumental view holds that the meaning 
of the message must be considered in terms 
of its situational context. Since content is 
subject to a variety of inhibitory factors (e.g.,
lying), some theorists (e.g., Mahl, 1961) posit 
that the speaking “process” more directly re- 
flects underlying feeling tone. The present 
results support the representational viewpoint 
and suggest that at least in some (e.g., struc- 
tured) interactions, content is more useful 
than “process” as an index of transitory anxi-
ety. The generalizability of this finding is
questionable, since, as has been mentioned, 
the conditions of the present study may have 
fostered the use of content, rather than 
“process,” as a pathway for anxiety expres- 
sion. 

Turning to other measures, the significantly 
higher verb-adjective ratios of HA subjects 
are consistent with the Balken-Masserman 
(1940) contention that higher ratios point to 
greater anxiety and tension. While the stress 
difference was not significant, this may indi- 
cate that the use of high or low verb-adjec- 
tive ratios is “characterological” in nature, 
that is, this aspect of language behavior is 
independent of the nature of the verbal sam-
ple. The results with this ratio are most com-
parable to those of Benton, Hartman, and
Sarason (1955), who failed to obtain signifi-
cant differences in verb-adjective ratios be-
tween high- and low-anxious subjects when
anxiety was measured by scores on the Taylor 
Manifest Anxiety scale. Though it is possible 
that the sampling of TAT cards contributed to 
the differing results, these divergent findings 
are more likely a reflection of the greater dis- 
criminatory power of a situation-specific anxi- 
ety scale in contrast to a general or manifest 
anxiety scale. The results of these two studies 
are consistent with research demonstrating 
that anxiety scales have greater predictive 
power when the item content of the scale 
closely approximates the experimental condi-
tions in which the subject performs (Sarason,
1960).

Consideration of the lack of significant 
results with the type-token ratio suggests that 
this measure is not useful in anxiety assess- 
ment. It is possible, however, that segmental 
ratios, despite their convenience for scoring, 
may not be as sensitive as the cumulative or 
total type-token ratio. 


Implications for Test-Anxiety Theory 


These findings have implications for test- 
anxiety theory (Sarason et al., 1960). Results 
with the content measure support the theoreti- 
cal contention that for HA subjects, the test- 
ing situation arouses guilt feelings and fears 
of retaliation via physical injury and loss of 
love. The scores on the Content Anxiety scale 
demonstrate that HA subjects under high 
stress are indeed more preoccupied with 
themes of guilt, shame, and physical and 
narcissistic injury. The theory also states 
that the anxiety reaction is a danger signal 
which warns the subject that the expression 
of hostility will lead to these fears of retalia- 
tion and guilt feelings, but the present study 
did not provide data to indicate that anxiety 
was generated by problems involving hostility 
expression. Indeed, it is probable that in- 
creased preoccupation with guilt and fear 
themes represents a generalized consequence 
of anxiety, irrespective of the nature of the 
impulse that triggers the anxiety reaction. 

The results for the type-token ratio appear 
to have little bearing on the theory, since the 
lack of statistical significance apparently 
stems from difficulties in the measure itself. 
Results with nonfluency measures, in contrast,
may have implications for the theory. Spe- 
cifically, test-anxiety theory states that in- 
creased anxiety leads to interfering task- 
irrelevant behavior in HA subjects, and speech 
disturbances can be considered prime exam- 
ples of such interfering responses. Contrary 
to the theory, the results with this measure 
may indicate that under some stressful condi- 
tions HA subjects can inhibit certain irrele- 
vant responses. 

An important finding of the present study 
was the failure to obtain significant interac- 
tion effects even when main effects were sig- 
nificant; that is, there was no differential 
increase in anxiety in HA subjects under high 
stress. This differential increase is predicted
in test anxiety theory, since the anxiety re- 
action is presumed to occur only under threat- 
ening, evaluative conditions and not under 
“neutral” or low-stress conditions. It is quite 
possible, however, that in the present study 
anxiety was reactive to diverse stressful ele- 
ments contained in the low-stress condition 
(e.g., the tape recorder) and that as such, the 
present study did not provide an adequate 
test of the reactive anxiety hypothesis. 
The 50 behaviors originally used by Wickman on adults were rated on a
4-point scale of seriousness by 455 boys and 456 girls in Grades 7-12. By sex 
and grade level, the frequencies in each cell were compared by chi-square tests 
of independence to frequencies similarly generated by their teachers. Consider- 
ing the 10 behaviors rated most serious by the students and the 10 behaviors 
rated the least serious, disagreement was observed between boys and girls, 
among grades, and between students and teachers. Some of the behaviors gave 
rise to more disagreements than others. The relative orders of the behaviors 
as ranked by students and teachers were compared to the order established by
Wickman’s original teacher group.


Although he might not be considered ter- 
ribly profound, an individual could make the 
casual observation that certain behaviors of 
children are considered “problems,” while 
others are not. He might also observe that it 
is the adult in our society who most fre- 
quently makes this discrimination and who 
dictates the ensuing treatments, educational 
or otherwise. To the degree that the adult 
perceives the seriousness of the behavior dif- 
ferently from the child exhibiting that be- 
havior will the treatments he applies to that 
behavior be inappropriate in the eyes of the 
child. Consequently, the treatment applied 
may be of less than optimum effectiveness in 
changing that behavior. 

The present study was undertaken to pro- 
vide some data about the degree of consist- 
ency between children and teachers in terms 
of how they view the seriousness of “prob-
lem” behaviors. A number of earlier studies
have attempted to assess the consistency be- 
tween various adult subgroups, such as teach- 
ers and mental hygienists (Wickman, 1929), 
elementary teachers and clinicians (Stouffer, 
1952), secondary teachers and clinicians 
(Stouffer, 1956), and attitude changes be- 
tween these groups over the years (Hunter, 
1957; Schruppe & Gjerde, 1953). Other 
replications are presented in Beilen (1962). 
Little, if any, data are presently available 
concerning the degree of consistency between 
teachers and children themselves. 


PROCEDURE 


Four hundred children in Grades 7-12 in a 
northern Illinois town were administered a list of 
50 behaviors which were used in the Wickman
(1929) study and asked to briefly describe each 
behavior. The common description was then placed 
in parentheses after each original description so 
that the meaning of each behavior would be under- 
stood by the experimental Ss. The 50 behaviors 
with their common interpretations are listed in Table
1.

Fourteen female and 26 male teachers of Grades 
7-12 in three northern Illinois towns and their 455
male and 456 female students were asked to rate 
each behavior on a four-point scale. The number 
of Ss per grade ranged 115-198, and their ages 
ranged 12-20 years. The professional experience of 
the teachers ranged 1-25 years. 

The three towns, somewhat semirural, each had 
less than 10,000 population and were within a 20- 
mile radius of a large metropolitan center. The 
communities were perceived by the authors to be 
somewhat above average in socioeconomic distribu- 
tion, and the school physical plants were considered 
modern and up-to-date. 

Neither students nor teachers were informed about 
comparisons to be made and were given only the 
following directions: 

Following this page, you will find a list of fifty 
behaviors which have been seen in children and 
adolescents. Some of these behaviors may never 
occur in some students, but assuming that they did 
occur, how serious would you consider them to 
be? Make your ratings according to your own 
personal opinion as to the seriousness of the 
behavior as you see it in others. Make your rat- 
ings quickly. Be sure to rate each of the fifty 
items. The words in parentheses may help you to 
better understand the meaning of each word or
phrase.

TABLE 1


RANK OF WICKMAN’S BEHAVIORS BASED ON STUDENT FREQUENCIES


Behavior | Frequency in most serious categories | Total responses | Seriousness rank

1. Stealing 798 908 1 
2. Cruelty, bullying (picking on others) 457 909 22 
3. Heterosexual activity (making out) 354 899 35 
4. Truancy (skip school) 513 901 13 
5. Unhappy, depressed (sad) 287 903 43 
6. Impertinence, defiance (talking back) 541 907 11 
7. Destroying school property 747 909 2 
8. Unreliableness (can’t depend on) 506 907 14 
9. Untruthfulness (lie) 717 908 3
10. Disobedience (not obey, not do as told) 612 906 7 
11. Temper tantrums (temper outbursts) 550 905 9 
12. Resentfulness (against—dislike) 348 903 37 
13. Unsocial, withdrawing (not friendly) 364 905 31 
14. Obscene notes/talk (dirty notes, talk) 551 908 10 
15. Nervousness (jittery) 260 906 45 
16. Cheating (copying) 617 910 6 
17. Selfishness (not sharing) 284 905 44 
18. Quarrelsomeness (argue, fight) 445 909 25 
19. Domineering (bossy) 368 904 30
20. Lack of interest in school 490 910 17 
21. Impudence, rudeness (not polite) 452 906 23 
22. Easily discouraged (give up) 493 908 16 
23. Fearfulness (afraid) 334 908 39 
24. Suggestible (easily led) 495 903 15 
25. Enuresis (wet the bed or the self) 644 906 5 
26. Masturbation (sex-playing with the self) 692 909 4 
27. Laziness (not active) 351 910 36
28. Inattention (not paying attention) 397 904 27 
29. Disorderliness in class (acting up) 474 908 19
30. Sullenness (sulk, pout) 339 908 38 
31. Physical coward (sissy) 379 906 28 
32. Overcritical of others (finding fault) 461 909 20 
33. Sensitiveness (easily hurt) 362 908 33 
34. Carelessness in work (messy) 320 906 42 
35. Shyness (bashful) 167 905 48 
36. Suspiciousness (suspecting others) 376 902 29 
37. Smoking (use of tobacco) 541 907 12 
38. Stubbornness (bull-headed) 327 906 40 
39. Dreaminess (day dream) 221 902 47 
40. Profanity (swearing) 586 909 8 
41. Attracting attention (cutting up in class) 459 908 21 
42. Slovenly in personal appearance (sloppy) 450 906 24 
43. Restlessness (over-active) 223 905 46 
44. Tardiness (late) 325 906 41
45. Thoughtlessness (forgetting) 361 906 34 
46. Tattling (telling on others) 478 902 18
47. Inquisitiveness (asking questions) 99 907 50 
48. Interrupting (butting in) 363 906 32 
49. Imaginative lying (exaggerating) 421 899 26
50. Whispering (talking softly) 100 907 49 
Please look at the first behavior of Stealing. If
you think that stealing is not at all serious, place 
your check mark under that heading; if you think 
it is slightly serious, place your check mark in 
that column; and if you think that it is serious or 
very serious, place your check mark accordingly. 
Do the same for each behavior. There are no 
right or wrong answers. Be sure that your ratings 
represent your own opinions. Are there any ques- 
tions? 


Interactions between the dimensions of sex of 
student, grade level of student, and group status 
(teachers versus student) were analyzed by chi- 
square tests of independence on each behavior. The 
dimensions of sex, grade level, and academic spe- 
cialty of the teachers were excluded from the 
analyses because of the restricted number partici- 
pating. 
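
A chi-square test of independence on one behavior can be sketched as follows. This is a minimal illustration, not the authors' computation: the function name is hypothetical and the frequencies are invented. Rows are the groups being compared and columns are the four seriousness categories.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and df for an r x c frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


CHI2_CRIT_DF3_P05 = 7.815  # tabled chi-square critical value, df = 3, alpha = .05

# Hypothetical ratings of one behavior (invented counts).
# Columns: not at all serious, slightly serious, serious, very serious.
ratings = [
    [120, 260, 330, 200],  # students
    [10, 14, 10, 6],       # teachers
]
statistic, df = chi_square_independence(ratings)
print(statistic, df, statistic > CHI2_CRIT_DF3_P05)
```

With two groups and four rating categories the test has 3 degrees of freedom. Small expected frequencies in the teacher row would weaken the approximation, which is one reason to be cautious with a teacher sample of 40.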

Efforts were concentrated only on the 10 behav- 
iors which showed the largest frequencies in the 
two most serious categories and the 10 items which 
showed the largest frequencies in the two least 
serious categories, as rated by the students. These 
are indicated by seriousness ranks of 1-10 and 41- 
50 in Table 1. Omitted from analysis were those 
behaviors showing a relatively rectangular frequency 
distribution in the four categories. 
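
The ranking rule described above amounts to sorting behaviors by their frequency in the two most serious categories. The sketch below uses six of the 50 frequencies from Table 1, so the resulting ranks run 1-6 rather than 1-50. The function name is hypothetical, and ties (such as the two behaviors with a frequency of 541 in Table 1) are broken here by input order, which is an assumption and not a rule stated in the text.

```python
# Frequencies in the two most serious categories, copied from Table 1.
most_serious_counts = {
    "Stealing": 798,
    "Destroying school property": 747,
    "Untruthfulness": 717,
    "Shyness": 167,
    "Whispering": 100,
    "Inquisitiveness": 99,
}


def seriousness_ranks(counts):
    """Rank 1 goes to the behavior most often placed in the serious categories."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {behavior: rank for rank, behavior in enumerate(ordered, start=1)}


ranks = seriousness_ranks(most_serious_counts)
for behavior, rank in ranks.items():
    print(rank, behavior)
```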
RESULTS


The results of the 500 chi-square tests of 
independence are reported in Table 2 for the 
10 most serious and the 10 least serious be- 
haviors. 


Differences among Students 


Sex differences. Male students differed from 
female students on all 10 of the most serious 
behaviors and on 6 of the 10 least serious. 
The fewest number of disagreements between 
sexes occurred in Grade 8 (2 behaviors) and 
the greatest in Grade 9 (15 behaviors). At
other grades the number of behaviors on 
which disagreements between sexes were ob- 
served ranged 5-7. Female students were 
inclined to rate both groups of extreme be-
haviors as more serious than did boys (judging
from raw frequencies).

Grade differences. Disagreements were 
noted among grades on 7 each of the 10 most 
serious and the 10 least serious behaviors. 


TABLE 2
SUMMARY OF TESTS OF INDEPENDENCE FOR 10 MOST SERIOUS AND 10 LEAST SERIOUS BEHAVIORS

[Table body illegible in the source scan. Columns: the 10 most serious and the 10 least serious behaviors. Rows: comparisons among grade levels, between sexes, male students vs. teachers, and female students vs. teachers, with entries by grade. Entries: significance levels (.01, .05, .10).]
Female students differed over grades more
frequently than boys (girls differed over 
grades on 7 of the 10 most serious behaviors 
and on 5 of the least serious, while boys dif- 
fered over grades on 3 of each). 

Sources of disagreement. Most disagree- 
ments between the sexes of students occurred 
on the behaviors of masturbation (four of 
six grades), disobedience, untruthfulness, 
and temper tantrums (where sex differences 
occurred at three of the six grades in each). 


Differences between Students and Teachers 


Differences by sex of student. Little dis- 
tinction was noted between the sex of the 
students with regard to the number of dis- 
agreements they had with the teachers. Boys 
generally disagreed with the teachers on the 
same behaviors as did girls, at least on the 
10 most serious behaviors. Of the 26 disagree-
ments of boys with teachers on these behav-
iors, all but 1 were also points of disagreement
between girls and teachers. Of the least serious
behaviors, only half the disagreements between
boys and teachers were upheld by the girls.
The boys appeared to have an approximately 
equal number of disagreements with the 
teachers on the 10 most serious behaviors and 
the 10 least serious (26 versus 27). Most of 
the disagreements between girls and teachers 
occurred on the most serious behaviors (31 
versus 19). The total number of disagree- 
ments with teachers appeared equal for the 
two sexes (53 versus 50). 

Differences by grade level. Although no 
single grade level appeared to have a greater 
number of disagreements between students 
and teachers than another, Grade 10 did ap- 
pear to have fewer. Here, boys showed dis- 
agreements with teachers on 5 behaviors, 
while girls disagreed with teachers on 6. At 
other grades the number of behaviors on 
which disagreements with teachers occurred 
ranged 9-11 for boys and 8-10 for girls. 

Sources of disagreement. Students as a 
whole differed more frequently with teachers 


TABLE 3 
TWENTY EXTREME RANKINGS OF BEHAVIORS BY PRESENT STUDENTS AND TEACHERS AND BY WICKMAN’S TEACHERS


Rank | Students | Present teachers | Wickman teachers

Most serious
1 | Stealing | Destroying school property | Heterosexual activity
2 | Destroying school property | Stealing | Stealing
3 | Untruthfulness | Untruthfulness | Masturbation
4 | Masturbation | Cheating | Obscene notes/talk
5 | Enuresis | Obscene notes/talk | Untruthfulness
6 | Cheating | Disobedience | Truancy
7 | Disobedience | Cruelty, bullying | Impertinence
8 | Profanity | Easily discouraged | Cruelty, bullying
9 | Temper tantrums | Unreliableness | Cheating
10 | Obscene notes/talk | Impertinence, defiance | Destroying school property

Least serious
41 | Tardiness | Tardiness | Dreaminess
42 | Carelessness | Tattling | Imaginative lying
43 | Unhappy | Selfishness | Interrupting
44 | Selfishness | Smoking | Inquisitiveness
45 | Nervousness | Physical coward | Over-critical
46 | Restlessness | Sensitiveness | Tattling
47 | Dreaminess | Whispering | Whispering
48 | Shyness | Inquisitiveness | Sensitiveness
49 | Whispering | Shyness | Restlessness
50 | Inquisitiveness | Restlessness | Shyness
on the 10 most serious behaviors than on the
10 least serious (5 versus 3). This extent of 
disagreement with teachers was upheld when 
considering the sex and the grade level of the 
students (57 versus 46). Of the 10 most 
serious behaviors, the largest number of dis-
agreements was observed on the behaviors of
masturbation, enuresis, disobedience, profan-
ity, and obscene notes/talk. The smallest num-
ber of disagreements within this extreme
group was observed for the behaviors of de-
stroying school property and temper tan-
trums. Of the 10 least serious behaviors, the
largest number of disagreements was found on 
the behaviors of restlessness and nervousness. 
The smallest number of disagreements within 
this group was found for the behavior of 
dreaminess. 


Composite Teacher Ranking versus 
Composite Student Ranking 


A composite teacher ranking of the serious- 
ness of the 50 behaviors was derived by the 
same technique as that employed earlier in 
identifying the 10 most serious and 10 least 
serious behaviors on the basis of student re- 
sponses. The 10 most serious and 10 least 
serious behaviors, as perceived by the teach- 
ers, as well as the comparable lists for stu- 
dents and for teachers in the Wickman 
(1929) study, are presented in Table 3. 

Of the 10 most serious and the 10 least 
serious behaviors on the student list, 60% 
also appear on the present teacher list, while 
60% and 50%, respectively, appear on Wick- 
man’s teacher list. Of the 10 most serious 
and the 10 least serious behaviors on the 
present teacher list, 60% also appear on 
Wickman’s teacher list. 
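
The overlap percentages for the student list can be checked directly against the lists in Table 3. The sketch below counts shared behaviors without regard to rank order; the function and variable names are hypothetical, and the behavior names are copied from Table 3.

```python
def shared_count(list_a, list_b):
    """Number of behaviors appearing on both lists, ignoring rank order."""
    return len(set(list_a) & set(list_b))


students_most = [
    "Stealing", "Destroying school property", "Untruthfulness", "Masturbation",
    "Enuresis", "Cheating", "Disobedience", "Profanity", "Temper tantrums",
    "Obscene notes/talk",
]
teachers_most = [
    "Destroying school property", "Stealing", "Untruthfulness", "Cheating",
    "Obscene notes/talk", "Disobedience", "Cruelty, bullying",
    "Easily discouraged", "Unreliableness", "Impertinence, defiance",
]
wickman_most = [
    "Heterosexual activity", "Stealing", "Masturbation", "Obscene notes/talk",
    "Untruthfulness", "Truancy", "Impertinence", "Cruelty, bullying",
    "Cheating", "Destroying school property",
]
students_least = [
    "Tardiness", "Carelessness", "Unhappy", "Selfishness", "Nervousness",
    "Restlessness", "Dreaminess", "Shyness", "Whispering", "Inquisitiveness",
]
teachers_least = [
    "Tardiness", "Tattling", "Selfishness", "Smoking", "Physical coward",
    "Sensitiveness", "Whispering", "Inquisitiveness", "Shyness", "Restlessness",
]
wickman_least = [
    "Dreaminess", "Imaginative lying", "Interrupting", "Inquisitiveness",
    "Over-critical", "Tattling", "Whispering", "Sensitiveness",
    "Restlessness", "Shyness",
]

comparisons = {
    "students vs. present teachers, most serious": (students_most, teachers_most),
    "students vs. present teachers, least serious": (students_least, teachers_least),
    "students vs. Wickman teachers, most serious": (students_most, wickman_most),
    "students vs. Wickman teachers, least serious": (students_least, wickman_least),
}
for label, (a, b) in comparisons.items():
    print(f"{label}: {100 * shared_count(a, b) // len(a)}%")
```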

Four of the behaviors in each extreme of 
the student list did not appear in the same 
extreme of the present teacher list. Five of 
these eight behaviors had composite teacher 
ranks within a fifth adjacent to their location 
in the student list, while two others barely 
missed falling into an adjacent fifth. One be- 
havior (unhappiness) nearly became one of 
the 10 most serious behaviors on the teacher 
list, while it was among the 10 least serious 
on the student list. 
CONCLUSIONS


The data would suggest that considerable 
disagreements exist concerning the serious- 
ness of certain behaviors. These disagreements 
were observed between teachers and students, 
as well as among students themselves with
respect to sex and grade level. 

At least 5 of the 10 most serious and 3 of 
the 10 least serious behaviors have been iden- 
tified as those which appear to cause the 
greatest disagreement between students and 
teachers. Teachers in the present study rated 
the behaviors of masturbation, enuresis, pro- 
fanity, and restlessness as being less serious 
than did students and tended to rate the be- 
haviors of disobedience, obscene notes/talk, 
nervousness, and unhappiness as being more 
serious than did students. 

To the degree that the above data can be 
generalized to other school populations, teach- 
ers will need to consider the implications of 
these differing perceptions of seriousness when 
dealing with student behavior. Because of the 
apparent variation within this subgroup of 
junior high school and high school children, 
the teacher’s task is made more difficult in 
selecting an “appropriate” treatment. These 
grade levels would appear to be no place for 
stereotyped reactions to student behavior 
patterns. 
FACTORS RELATED TO IMPROVEMENT IN MENTAL
HOSPITAL PATIENTS ¹


PETER M. LEWINSOHN 


University of Oregon 


The aim of this investigation is to study the relationship between a number of 
presumed prognostic indicators and several different aspects of improvement in 
mental hospital patients. 165 patients from an acute and intensive treatment 
hospital served as Ss. Data were collected at times of admission, discharge, 
and 6 mo. after discharge. 7 different aspects of improvement were measured. 
Results were as follows: (a) Some of the prognostic variables, for example, 
chronicity, were not found to be related to any of the improvement measures; 
(b) many of the prognostic variables were found to be related to some aspects 
of improvement but not to others; (c) in a few instances these relationships 
were found to be a function of the sex of the patients.


There have been many studies concerned 
with prognostic indicators, but there is rela- 
tively little empirical support for the predic- 
tive power of many of these factors. The 
voluminous literature in this area has been 
reviewed by Zubin (1959), Windle (1952), 
Fulkerson and Barry (1961), Huston and 
Pepernik (1958), and others, all of whom 
have called attention to the contradictory 
nature of many of the findings. 

Interpretation of many prognostic studies 
is complicated by the possibility that improve- 
ment (defined as a gain of some sort from 
one point in time to another) may have been 
confounded with initial or terminal level of 
adjustment. This difficulty stems from the 
nature of the outcome criteria which have 
been employed. The three outcome criteria 
most frequently used in prognostic studies 
have been duration of illness, which has 
typically been operationally defined by length 
of hospitalization (e.g., Phillips, 1953), staff 
ratings of improvement (e.g., Astrup, Fos- 
sum, & Holmboe, 1962), and ratings of social 
adequacy after discharge from the hospital 
(e.g., Schofield, Hathaway, Hastings, & Bell, 
1954). While these outcome measures possess 
considerable practical value, their relationship 


¹ This study was supported in part by Public
Health Service Research Grants (M5548-A and 
M6029-A) from the National Institute of Mental 
Health and by a grant from the Illinois Psychiatric 
Training and Research Authority. Parts of this 
paper were presented at the 1964 meeting of the 
Midwestern Psychological Association in St. Louis, 
Missouri. 


to improvement is not clear. The use of 
length of hospitalization, for example, as a 
criterion of improvement assumes that the 
patient who is discharged from a hospital has 
improved more than the patient who is not 
discharged within the same length of time. 
It is equally plausible, however, that the pa- 
tient who is discharged was less disturbed to 
begin with than the patient who was not dis- 
charged, that both patients improved by about 
the same amount and at the same rate, and 
that the initially less disturbed patient was 
discharged sooner. The same reasoning could 
be applied to ratings of improvement and of 
social adequacy after discharge, namely, that 
they may reflect differences between patients 
in level of adjustment or severity of disturb- 
ance, but not in improvement. It is thus not 
clear whether some of the previously identified 
prognostic variables are related to improve- 
ment or to initial severity of disturbance. 
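
The confound described above can be made concrete with a small worked example. It is a sketch under strong simplifying assumptions: severity is taken to fall linearly, discharge occurs at a fixed severity threshold, and all names and numbers are hypothetical.

```python
def days_to_discharge(initial_severity, discharge_threshold, improvement_per_day):
    """Days until severity falls to the threshold, assuming linear improvement."""
    if improvement_per_day <= 0:
        raise ValueError("improvement rate must be positive")
    return max(0.0, (initial_severity - discharge_threshold) / improvement_per_day)


# Two hypothetical patients who improve at exactly the same daily rate.
RATE = 0.25       # severity points gained per day (invented)
THRESHOLD = 30    # severity at which discharge occurs (invented)

patient_a = days_to_discharge(60, THRESHOLD, RATE)  # more disturbed at admission
patient_b = days_to_discharge(45, THRESHOLD, RATE)  # less disturbed at admission

print(patient_a, patient_b)
```

Patient B leaves the hospital in half the time even though both patients improve at the same rate, so length of hospitalization alone cannot separate improvement from initial severity.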
The present investigation is concerned with 
the prediction of the changes which occur in 
mental hospital patients during their stay in 
an intensive treatment psychiatric hospital 
and in their social adjustment following dis- 
charge. The following seven measures of im- 
provement were included: (a) reduction in 
psychotic symptoms, (b) reduction in anxi-
ety, (c) improved conceptual functioning, (d)
time spent in hospital from admission to dis- 
charge, (e) staff ratings of improvement, (f) 
social adjustment after discharge from the 
hospital, and (g) ability to stay out of a men- 
tal hospital for 15 months following discharge. 
The major aim of the present study is to
evaluate the relationship between these mea- 
sures of improvement and a number of prog- 
nostic variables. 


PROCEDURE 
Subjects 


Patients from an acute and intensive treatment 
psychiatric hospital² were used in this study. De-
scriptive information about the Ss is given in
Table 1. Patients received whatever treatment, or
combination of treatments, seemed best indicated
in the judgment of the psychiatric team and the
patient’s physician. Kind of treatment received by
the patients was recorded and is also shown in 
Table 1. 

A variety of data was collected on many of 
these patients over a period of about a year and 
a half—at times of admission, discharge, and 6 
months after discharge. Patients who were dis- 
charged to another institution were not included 
in this study. 


Outcome Criteria 


1. Ratings of psychoticism. At admission and dis-
charge, 117 patients were individually interviewed,
the two interviews being carried out by different 
staff psychologists. Immediately following the inter- 
view the psychologist rated the patient on a 77-item 
rating scale which consisted of the in-patient and 
out-patient forms of the Multidimensional Scale 
for Rating Psychiatric Patients (MSRPP). The 
items were scored for six of the factors (Paranoid 
Projection, Perceptual Distortion, Conceptual Dis- 
organization, Motor Disturbance, Suspicious-Hostile, 
and Latent Schizophrenia) which were identified in 
factor analyses reported by Lorr (1953) and by 
Lorr, Rubinstein, and Jenkins (1953). All of these 
factors are defined by items which deal with the 
presence and severity of psychotic symptoms of
various kinds. Their intercorrelations were judged
to be of sufficient magnitude to warrant combining
the separate factor scores into a single psychoticism
score. Interrater reliability of this score, based upon
the correlation of the ratings with those of another
psychologist who observed some of the interviews
through a one-way screen, was .91 (n = 16).

2. IPAT anxiety. As part of a battery of psycho- 
logical tests, administered shortly after admission 
and just prior to discharge, the patients (N = 143) 


2 The data for this study were collected at the 
Larue D. Carter Memorial Hospital, Indianapolis, 
Indiana. The author wishes to express his apprecia- 
tion to the members of the hospital staff for their 
continuing help, support, and encouragement in this 
study. 

3 The interviews were conducted by Myron How- 
land, James Lomont, Robert Nichols, Herbert 
Nickel, Lee Pulos, George Siskind, and the author. 


TABLE 1 

DESCRIPTIVE INFORMATION ABOUT PATIENT POPULATION 

Characteristic                              Value 

Proportion of males                         39% 
M age (in yr.)                              32.1 
M education (in yr.)                        11.4 
M length of hospitalization (in days)       172 
First admission                             82% 

Type of commitment: 
  Voluntary                                 83% 
  Temporary                                 16% 
  Court                                      1% 

Diagnosis: 
  Schizophrenic reaction                    44% 
  Other psychotic                           10% 
  Psychoneurotic                            23% 
  Personality                               23% 

Marital status: 
  Single                                    34% 
  Married                                   53% 
  Separated, widowed, divorced              13% 

Occupational group: 
  Professional                              11% 
  Managerial                                 5% 
  Sales, clerical, skilled                  14% 
  Labor, service work                       16% 
  Housewife, student                        54% 

Treatment: 
  ECT                                       53% 
  Tranquilizer                              82% 
  Psychotherapy                             40% 
  Insulin coma                               4% 

Note. N = 165. 


had taken the IPAT Anxiety Questionnaire (Cattell, 
1957), which was scored for total anxiety. 

3. Conceptual functioning. Of the patients, 147 
had taken the Vocabulary and Abstraction subtests 
of the Shipley-Hartford Retreat Scale (S-H; Ship- 
ley, 1940) and the clinical form of the Gorham 
Proverbs Test (Gorham, 1956) at admission and at 
discharge. Alternate forms of the Proverbs test were 
used at admission and discharge. Both tests were 
scored according to standard directions. 

4. Staff ratings of improvement. Ratings of im- 
provement were obtained from the patients’ physi- 
cians or therapists at time of discharge for 119 of 
the patients. Four five-point scales dealing with 
symptoms, personality adjustment, and subjective 
feeling of well-being were used for this purpose, 
and the correlations between these ratings were 
computed. Since the correlations were all posi- 
tive and statistically significant, it was decided to 
combine them into a single improvement score. 
The Ss were then dichotomized into more improved 
(MI) and less improved (LI)⁵ groups by splitting 
Ss, separately for male and female Ss, at the median. 


4 This material has been deposited with the Ameri- 
can Documentation Institute. Order Document No. 
9654 from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress, 
Washington, D. C. 20540. Remit in advance $2.00 
for microfilm or $3.75 for photocopies, and make 
checks payable to: Chief, Photoduplication Service, 
Library of Congress. 

5 The terms more improved and less improved 
are used throughout to convey the relative positions 
of the two groups on the improvement dimension, 
5. Social adjustment ratings. From the larger 
sample, 45 patients were selected for follow-up 
study 6 months after discharge from the hospital. 
Description of the sample and of the follow-up 
procedures has already been presented in greater 
detail elsewhere (Lewinsohn & Nichols, 1964). In 
most respects this group was similar to the larger 
group of patients. Patients were requested to bring 
a responsible relative for the follow-up evaluation. 
The relative was interviewed by a psychiatric social 
worker⁶ concerning the patient’s behavior at three 
points in time: (a) before he became acutely ill, 
(b) immediately prior to hospitalization, and (c) 
at the time of follow-up. On the basis of this in- 
formation the social worker rated the patient’s 
level of functioning on three parallel 10-point 
scales, representing the three points in time, on 
items dealing with each of the following general 
areas of functioning: (a) personal responsibility, 
(b) employment and work adjustment, (c) financial 
status, (d) interpersonal relationships, (e) participa- 
tion in activities outside the home, and (f) overall 
social adjustment. Total scores for each of these 
areas and the correlation between them were com- 
puted (see Footnote 4) and found to be positive 
and statistically significant. Only the overall adjust- 
ment rating was used for this study. 

6. Length of hospitalization. The number of days 
between admission and discharge was determined 
for each patient. The patients were then divided 
into “short,” “intermediate,” and “long” hospitaliza- 
tion groups by splitting the sample, separately for 
male and female Ss, into lower, middle, and upper 
thirds. 

7. Rehospitalization. Two years after every pa- 
tient in this study had been discharged, the records 
of the State Department of Mental Health⁷ were 
searched to determine which of the patients had 
been rehospitalized in any of the state institutions. 
Individual physicians and therapists were also con- 
tacted about their knowledge of patients who had 
been rehospitalized in private or out of state hos- 
pitals. Date of rehospitalization was recorded. As 
there was a relatively large number of patients who 
had been rehospitalized between 12 and 15 months 
after discharge, 15 months was used as the cutting 
score. Of the patients, 32 (13 males and 19 females) 
had been rehospitalized. To compare the rehospital- 
ized with a nonrehospitalized group, rehospitalized Ss 
were individually matched for age and sex with non- 
rehospitalized Ss taken at random from the total 
sample. 
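
The within-sex splits described for the staff ratings (median split) and for length of hospitalization (thirds) can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_within_sex`, the toy data, and the handling of ties and uneven group sizes are hypothetical, since the report does not describe those details.

```python
from collections import defaultdict


def split_within_sex(records, n_groups):
    """Assign each (patient_id, sex, value) record to one of n_groups
    rank-ordered groups, computed separately within each sex.

    Group 0 holds the lowest values. Ties are broken by input order,
    which is an assumption; the source does not say how ties were handled.
    """
    by_sex = defaultdict(list)
    for patient_id, sex, value in records:
        by_sex[sex].append((value, patient_id))

    labels = {}
    for items in by_sex.values():
        items.sort(key=lambda pair: pair[0])  # stable sort on the value only
        n = len(items)
        for rank, (_, patient_id) in enumerate(items):
            labels[patient_id] = rank * n_groups // n
    return labels


# Hypothetical lengths of stay in days (not data from the study).
stays = [
    ("m1", "M", 40), ("m2", "M", 150), ("m3", "M", 330),
    ("f1", "F", 80), ("f2", "F", 160), ("f3", "F", 290),
]
tertiles = split_within_sex(stays, n_groups=3)      # short / intermediate / long
median_split = split_within_sex(stays, n_groups=2)  # lower / upper half
```

Splitting within sex keeps the proportion of males and females roughly equal across the resulting groups, so that sex differences on the measure are not confounded with group membership.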


Statistical Considerations 


The measurement of change, or gain, presents 
numerous psychometric difficulties (Harris, 1963). 


6 The author gratefully acknowledges the help 
received from members of the social services staff 
at Carter Hospital, especially from John Murray, 
who was responsible for the major portion of the 
work in developing the scale. 

7 The author is indebted to Marjorie May for her 
help with this phase of the study. 
Since the usual objective of treatment is to help 
the patient to improve in areas of greatest deficiency, 
a case can be made for the use of difference scores 
to represent change. The use of difference scores, 
however, did not seem justified because these scores 
tend to be correlated with admission scores, and, 
thus, the measure of improvement would have 
been confounded with initial level. An alternative 
is to compute residual gain scores, which involves 
the construction of linear regression equations with 
which to predict discharge scores from the admis- 
sion scores. Estimated discharge scores are then 
subtracted from actual discharge scores, and these 
residuals represent the amount by which S’s dis- 
charge score is above or below the score which 
would have been predicted for him on the basis of 
his admission score. Residual gain scores are thus 
independent of initial level. Their use was rejected 
for the following reasons: (a) When the correla- 
tions between admission and discharge scores are 
low, they primarily reflect discharge status and not 
necessarily improvement; (b) the use of residual 
gain scores makes assumptions about the nature 
of the metric, linearity of relationship, absence of 
interaction and ceiling effects, homoscedasticity, etc., 
which probably were not warranted by the data. It 
was therefore decided to use the admission data as a 
control variable to select MI and LI groups for the 
Psychoticism, Anxiety, Abstraction, Proverbs, and 
Social Adjustment measures in the following way: 
Separately for each of these outcome measures pa- 
tients were subdivided, on the basis of the magnitude 
of the difference scores, into two groups. One of 
these (MI group) consisted of patients whose dif- 
ference scores indicated substantial positive change, 
while the other (LI group) showed minimal change 
or change in a negative direction on the measure. 
These assignments were made solely on the basis of 
the change scores, all other identifying informa- 
tion having been removed. Pairs of Ss from the 
two subgroups were then closely matched on ad- 
mission level to constitute MI and LI groups for 
each of the criterion groups. Mean change scores 
for each of these groups are shown in Table 2. 
This procedure has the advantage of insuring that 
MI and LI groups differ in regard to amount of 
improvement shown on the measure while keeping 
initial level constant; it has the disadvantage of 
involving a slight loss of power, since a few Ss 
could not be matched for admission level, thus 
causing a reduction in N. 
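
The three ways of representing change discussed above (difference scores, residual gain scores, and MI/LI pairs matched on admission level) can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names, the toy scores, the sign convention (discharge minus admission), and in particular the greedy nearest-neighbor matching with a tolerance, since the report says only that pairs were "closely matched" and gives no algorithm.

```python
def difference_scores(admission, discharge):
    """Raw change: discharge minus admission (sign convention assumed)."""
    return [d - a for a, d in zip(admission, discharge)]


def residual_gain_scores(admission, discharge):
    """Actual discharge score minus the discharge score predicted from the
    admission score by a least-squares line."""
    n = len(admission)
    mean_a = sum(admission) / n
    mean_d = sum(discharge) / n
    sxx = sum((a - mean_a) ** 2 for a in admission)
    sxy = sum((a - mean_a) * (d - mean_d) for a, d in zip(admission, discharge))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_a
    return [d - (intercept + slope * a) for a, d in zip(admission, discharge)]


def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def match_pairs(mi, li, tolerance):
    """Pair each MI patient with the closest unused LI patient on admission
    score, keeping the pair only if the scores differ by at most `tolerance`.
    Inputs are lists of (patient_id, admission_score). Hypothetical procedure."""
    unused = list(li)
    pairs = []
    for patient_id, score in sorted(mi, key=lambda p: p[1]):
        if not unused:
            break
        best = min(unused, key=lambda q: abs(q[1] - score))
        if abs(best[1] - score) <= tolerance:
            pairs.append((patient_id, best[0]))
            unused.remove(best)
    return pairs


# Toy scores (not data from the study).
admission = [10, 20, 30, 40, 50]
discharge = [12, 18, 35, 37, 55]
raw_change = difference_scores(admission, discharge)
residual_gain = residual_gain_scores(admission, discharge)
r_with_admission = pearson_r(admission, residual_gain)  # zero up to rounding
```

In the toy data the residual gains are uncorrelated with admission level by construction, which is the independence property noted in the text; unmatched Ss simply drop out of `match_pairs`, which corresponds to the reduction in N mentioned above.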

Information for the prognostic variables came 
from two sources: (a) Psychiatric social workers 
(SW) rated the patients on scales (see Footnote 4) 
dealing with selected aspects of the patients’ back- 
ground and current life situation. The SW ratings 
were made on the basis of an interview with one 
or more close relatives of the patient shortly after 
the patient was admitted to the hospital. The rating 
scales were constructed with the active collabora- 
tion of the social workers who made the ratings. 
The use of the scales was continuously clarified 
in individual and group sessions with the raters. 
(b) Other aspects of socioeconomic background and 
TABLE 2 

MEAN SCORES AND MEAN CHANGE SCORES OF GROUPS ON OUTCOME MEASURES 

                                      More improved           Less improved 
Measure and group                     M      SD     N         M      SD     N 

M change on MSRPP Psychoticism (a) 
  Male                                17.6   11.8   17        6.5    17.0   17 
  Female                              13.3   10.6   29        0.9    11.7   29 

M change on IPAT Anxiety (a) 
  Male                                22.5   12.4   28        5.4    8.5    28 
  Female                              22.9   10.2   37        2.0    8.0    37 

M change on Shipley-Hartford Abstraction (a) 
  Male                                7.3    4.3    25        0.7    3.3    25 
  Female                              10.0   6.3    33        0.3    6.6    33 

M change on Proverbs (a) 
  Male                                4.8    3.2    24        −1.8   3.4    24 
  Female                              6.9    3.5    35        −2.3   3.3    35 

Staff ratings of change 
  Male                                17.8   1.9    28        14.4   1.5    27 
  Female                              18.8   1.1    23        14.9   1.3    26 

Follow-up overall adjustment change (b) 
  Combined sexes                      [M and SD illegible]  17   [M and SD illegible]  17 


Length of hospitalization (days) 

                Short               Intermediate        Long 
Group           M     SD    N       M     SD    N       M     SD    N 

Male            66    28    28      159   23    29      330   158   28 
Female          80    27    43      158   21    43      290   97    42 


(a) Admission to discharge. 
(b) Admission to 6 months after discharge. 


premorbid adjustment were rated by two research 
assistants⁸ on the basis of the patient’s completed 
hospital chart (C). Interrater reliability of the C 
ratings was assessed by having the two judges in- 
dependently rate the same 20 charts. Interrater 
agreement (see Footnote 4) ranged from excellent 
for some items to only fair for others. 

The correlation matrix of the SW ratings for the 
Ss was computed (see Footnote 4). The correlation 
among some of the individual items was judged to 
be sufficiently large to warrant combining correlated 
items into a single score. Scores for items dealing 
with financial security, family support, sociability, 
favorable childhood conditions, history of acting out 
(0-18 years), and history of neurotic symptoms 
(0-18 years) were thus computed for each patient. 
By a similar process, scores for items representing 
current stress, social withdrawal, sexual adjustment, 
and financial difficulties (0-18 years) were computed 
for each patient on the basis of the C ratings. 

8 Frances Clark and Betty Green were responsible 
for making these ratings. 


Chi-square was used where the independent vari- 
able was categorized; where it was continuous, 
analysis of variance was used. The N varies for the 
different comparisons because: (a) For administra- 
tive reasons some patients were placed on treatment 
before all the procedures could be completed. Some 
of the data were also needed at times by members 
of the hospital staff and a few records were lost 
in this way. (b) The SW and C raters had been 
instructed to leave blank any variable for which 
insufficient information was available. Two-tailed 
tests of significance were made in all instances. 
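
For a categorized prognostic variable compared across MI and LI groups, the chi-square statistic reduces to a simple formula in the 2 x 2 case. The sketch below is illustrative only: the function name and the cell counts are hypothetical, and it computes the uncorrected Pearson statistic, since the report does not say whether a continuity correction was applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table

                   category 1   category 2
        group 1        a            b
        group 2        c            d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: married vs. not married in an MI and an LI group.
chi2 = chi_square_2x2(20, 10, 10, 20)

# With 1 degree of freedom, the two-tailed .05 critical value is 3.841.
significant_at_05 = chi2 > 3.841
```

The analysis of variance used for continuous prognostic variables is not shown here.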


RESULTS AND DISCUSSION 


Means for the MI and LI groups on all 
prognostic variables were computed, and 
mean differences were tested for statistical 
significance (see Footnote 4). The results 
are presented in summary form in Table 3. 


9 The assistance of Dorothy Evans and Alan 
Wernick with the extensive data analysis is grate- 
fully acknowledged. 
The following seven prognostic variables were 
not found to be significantly related to any 
of the outcome criteria: history of previous 
hospitalization, history of previous episodes, 
duration of symptoms, type of onset (slow 
versus fast), vocabulary score, history 
of neurotic symptoms, and financial difficulties 
in the patient’s family before the age of 18. 
The lack of relationship between the indexes 
of chronicity and the measures of improve- 
ment is at variance with some of the earlier 
findings cited by Zubin (1959), while the 
negative results for vocabulary level do not 
substantiate a previously reported relationship 
between intelligence and improvement (Stot- 
sky, 1952). Two possible explanations can be 
offered for these discrepancies. (a) In the 
earlier studies no attempt was made to dif- 
ferentiate between initial degree of disturb- 
ance and actual amount of change (improve- 
ment) shown. The reported relationships for 
chronicity and intelligence could thus have 
come about if the more chronic or less intel- 
ligent patients were also the more severely 
disturbed and were judged as having “im- 
proved” less because after treatment and 
hospitalization they were still more disturbed 
than the less chronic or more intelligent 
patients. (b) The present negative findings 
may be a function of the particular patient 
population used. In terms of length of illness 
and number of previous hospitalizations, the 
present sample certainly was not very 
“chronic.” Whether a more chronic sample 
than the present one would have yielded re- 
sults more consistent with the earlier studies 
can only be answered by further research. 
Other prognostic variables were found, in 
varying degrees, to be related to some aspects 
of improvement but not to others. Thus, pre- 
cipitating stress was related only to improve- 
ment on Anxiety (the MI-Anxiety group was 
rated as having less precipitating stress), 
while occupational level was specifically 
related to improvement in conceptual func- 
tioning (the MI Proverbs and Abstraction 
groups had higher rated occupations). Diag- 
nosis, on the other hand, was predictive of 
the staff improvement ratings and of length 
of hospitalization (the group rated more im- 
proved had a lower proportion of patients 
with diagnoses of personality pattern distur- 
bance, and patients who were discharged 
sooner had a lower proportion of patients with 
diagnoses of schizophrenia). Marital status 
and the sexual adjustment ratings were the 
only prognostic variables which came close 
to being related to all the improvement 
measures. In 13 of the 14 comparisons the 
MI groups had a higher proportion of cur- 
rently married and better sexually adjusted 
patients. The main conclusion suggested by 
the present findings is that the efficacy of 
prognostic variables is a function of the im- 
provement criterion used; that is, prognostic 
variables were found to be related to some 
aspects of improvement but not to others. 

When Table 3 is examined in terms of the 
relative predictability of each of the improve- 
ment criteria some interesting trends emerge. 
Not one of the prognostic variables was 
found to be significantly related to rehospital- 
ization. A similar lack of relationship between 
a large number of prognostic variables and 
rehospitalization was reported by Marks, 
Stauffacher, and Lyle (1963). Next most 
unpredictable were the staff improvement 
ratings. The constellation of prognostic vari- 
ables found to be associated with improve- 
ment on anxiety indicated that a reduction 
in reported feelings of anxiety is in part de- 
pendent upon the absence of environmental 
stress of an interpersonal or financial nature. 
Improvement on Abstraction was found to 
be related to financial security and to the 
patient’s and his father’s occupational level. 
This constellation of prognostic variables sug- 
gests a relationship between the potential for 
improved conceptual functioning and social 
achievement or competence (Phillips & Zigler, 
1961). Further support for this hypothesis 
comes from the obtained relationship between 
patient’s occupational and educational level 
and improvement on Proverbs. Improvement 
in social adjustment after discharge showed 
the expected relationships with marital status, 
sexual adjustment, and the Phillips Scale of 
Premorbid Adjustment. 

Factor-analytic studies of improvement 
have reported finding a number of different 
and statistically independent dimensions of 
patient change (Cartwright, Kirtner, & Fiske, 
1963; Lewinsohn & Nichols, 1967; Nichols 
& Beck, 1960; Overall, Gorham, & Shawver, 
1961), indicating that improvement is a 
multidimensional phenomenon. The present 
finding of wide differences between improve- 
ment criteria in regard to the constellation of 
prognostic variables associated with each is 
consistent with the multidimensional hypo- 
thesis. The findings also indicate that the 
practice of combining several criteria into 
a single measure (Knight, 1941; Tucker, 
1953) is not justified in the study of prog- 
nosis. 
RORSCHACH IN RELATION TO OUTCOME IN PSYCHO- 
THERAPY WITH COLLEGE STUDENTS 


JOHN M. WHITELEY 
Washington University 

AND 

GRAHAM B. BLAINE, JR. 
Harvard University 

The focus of this study is on Rorschach factors related to outcome and 
duration of psychotherapy with college students. The sample was composed of 
50 male students who sought psychotherapy from the Harvard University 
Health Service. Specific Rorschach signs based on previous research and the 
Klopfer Rorschach Prognostic Rating Scale did not distinguish between the 
outcome or duration groups. A discriminant-function approach using total K, 
m, and R was the most productive, with significant differences between the 
No Change group and both Change (p < .05) and Symptomatic Improvement 
(p < .01) groups. 


Rorschach’s test has proven to be a popular 
clinical instrument and a subject for un- 
usually thorough analysis, experimentation, 
and review. In specific relation to psycho- 
therapy, it has had extensive usage as a diag- 
nostic tool for identifying type and extent of 
personality disturbance, thought processes, 
and defense mechanisms. Much of the recent 
Rorschach research in psychotherapeutic 
practice has been centered in clinical or 
hospital settings, with the problems of identi- 
fying patients who terminate early and 
those who remain to complete the course of 
therapy. 

The focus of this study is related to the 
bulk of previous Rorschach research in psy- 
chotherapy, but is more specific; namely, it 
attempts to determine factors related to out- 
come and duration of psychotherapy with 
college students. Psychotherapy research with 
college students involves a different emphasis 
from work with hospitalized patients. With 
a college population the focus is on ego proc- 
ess, which is considerably different from the 
emotional reconstruction and reeducation 
which forms a major part of therapy with a 
hospital population. The major common de- 
nominator of a hospital group is impairment 
severe enough to make outpatient treatment 
inadvisable or impossible. Since most studies 
of the Rorschach and therapy have been con- 
ducted with hospitalized patients, they are 
not directly relevant to the work of the 
college psychiatrist and psychologist. 

Zubin and Windle (1954), following their 
extensive review of the literature, noted that 
“what is most needed is confirmation or re- 
futation of existing claims of prognostic effi- 
ciency rather than new claims [p. 278].” In 
line with their recommendation, this study 
is an attempt to determine the utility of 
methods which successfully identify termina- 
tors and continuers, with the related problem 
of identifying those who change and do not 
change as a result of psychotherapy. Fol- 
lowing this application of previous techniques, 
a new discriminant equation has been de- 
veloped. 


METHOD 


Sample Selection 


The research sample was composed of 50 male 
college students who sought psychotherapy from the 
Harvard University Health Service psychiatric staff 
in the fall of 1963 and fall of 1964 and who were 
subsequently referred for psychological testing, in- 
cluding the Rorschach, to the Psychological Service 
of the University Health Service. These 50 students 
subsequently began psychotherapy, and therapy out- 
come ratings for 1963-64 or 1964-65 were made. 


Therapy Classification 

Students were classified as having had short-term 
therapy if they had received 3-24 sessions; long- 
term classification applied to students receiving 
25-100 sessions. The lower cutoff of 3 sessions was 
established because the initial 2 interviews were 
generally evaluative and diagnostic—the first by a 
psychiatric social worker and the second by the sub- 
sequent therapist. 


Outcome Ratings 

The method for evaluating psychotherapy outcome 
represented an elaboration of a method developed 
originally by Baumrind (1959). The seven psychia- 
trists who had served as therapists were asked to 
assign their former patients to one of the following 
five categories after reviewing their case notes: 
1. Worse would signify that the clinician felt 
that therapy had contributed to poorer adjust- 
ment in some specific way. 

2. No Change would signify that no significant 
change in clinical appearance could be demon- 
strated. If something other than therapy made 
the patient worse, he should be classified as No 
Change. 

3. Symptomatic Improvement would imply that 
the clinician was not sure of the saliency, stability, 
or significance of the changes in behavior. When 
unsupported by other data, a patient’s report indi- 
cating relief from anxiety or depression would 
usually constitute symptomatic improvement. 

4. Some Improvement would mean that there 
had been some improvement, but that the pa- 
tient’s clinical appearance indicated that fur- 
ther improvement was required in order to con- 
sider the patient reasonably well adjusted. 

5. Marked Improvement would mean that the 
patient was considered by the clinician to be 
substantially cured of the psychopathological con- 
dition for which he had come for help. Ratings of 
Some or Marked Improvement would be equiva- 
lent to a statement that some specific basic 
personality changes had occurred in the patient. 


These outcome ratings were considered as the 
change criteria. For the purposes of data analysis, 
Worse and No Change were considered No Change 
and Some Improvement and Marked Improvement 
were considered Change. 

The therapy evaluation scheme employed in ob- 
taining the outcome ratings was based by Baum- 
rind (1959) on construct validity. The scheme was 
particularly relevant to the frame of reference 
(evaluation of clinical appearance) of the Harvard 
psychiatrists whose ratings were requested. 

An important advantage in natural-setting research 
such as this is the fact that the raters know the 
patients over an extended period of time (3-100 
interviews), make ratings in the familiar terms of 
clinical appearance, and have the use of the case 
notes which they dictate after each therapy ses- 
sion. A limitation of Baumrind’s scheme was that 
interjudge reliability in rating was not reported, and 
it was not possible to obtain a reliability check as 
part of the present research. 


Data Analysis 


The Rorschachs were scored using the Klopfer, 
Kirkner, Wisham, and Baker (1951) system, with 
the exception that form level was evaluated accord- 
ing to the Hertz (1961) tables. Analysis of the data 
took three forms. The first was an application of 
approaches revealed by the related literature to 
have promise (Auld & Eron, 1959; Dana, 1954; 
Gibby, Stotsky, Hiler, & Miller, 1954; Gibby, Stotsky, 
Miller, & Hiler, 1953; Kotkov & Meadow, 1953). 
The second was the application of the Klopfer et al. 
(1951) Rorschach Prognostic Rating Scale (RPRS) 
to the Rorschach protocols in the sample. The 
third consisted of using two discriminant functions 
in the existing Rorschach literature (Affleck & Med- 
nick, 1959; Gibby et al., 1954), as well as developing 
a discriminant function based on the sample in this 
study. 


RESULTS 


In analyzing the data from this study, the 
following four comparisons were routinely 
made: Short-Term (N = 30) versus Long- 
Term (N = 20); No Change (N = 10) ver- 
sus Change (N = 30); No Change (N = 10) 
versus Symptomatic Improvement (N =10); 
and Symptomatic Improvement (N = 10) 
versus Change (N = 10). When the Short- 
Term groups, including No Change, Sympto- 
matic Improvement, and Change, and the 
Long-Term groups, including No Change, 
Symptomatic Improvement, and Change, were 
established, the cells were found to be too 
small for meaningful statistical comparison, 
given the total sample size of 50. 

In several instances where important and 
statistically significant differences were found 
between these subgroups, the differences have 
been reported, but note has been made of the 
small cell size and the vulnerability to in- 
flated probabilities. Cronbach (1949) cau- 
tioned against the inflation of probabilities 
which results from extensive statistical tests 
in Rorschach studies. Cross-validation is 
particularly important with these findings. 
They are reported for the primary benefit of 
those doing future Rorschach therapy re- 
search with college students and are not for 
clinical application. 


Results from Related Literature 


Only the Dana (1954) study of responses 
to Card IV provided a dimension that signif- 
icantly (p < .05) differentiated the Short- 
Term and Long-Term therapy groups. The 
Long-Term therapy group was found to have 
given more adequate responses to Card IV 
(responses with neutral tone and good form). 
Application of the Gibby et al. (1953) ap- 
proach to this college student population re- 
vealed two significant differences. The Change 
group was higher than the No Change group 
(p < .05) on both Sum C and m. In addi- 
tion, there were several marked trends which 
did not reach significance. The Change group 
was higher on R, CF, Fk, H, Fc, M, and ade- 
quate responses to Card IV. The No Change 
group was higher, again with differences not 
reaching significance, on A% and negative 
(hostility based) responses to Card IV (Dana, 
1954). The size of the range within the 
groups produced too much overlap for sig- 
nificant statistical differences to occur. 

Only R differentiated the Symptomatic Im- 
provement from the No Change group, with 
R being higher for Symptomatic Improvement 
(34.4 versus 25.1). There were no significant 
differences between the Change and Sympto- 
matic Improvement groups. 

Two interesting differences emerged from 
the Short-Term outcome and Long-Term 
outcome comparisons. The tentative nature 
of their basis should be underscored. First, 
the Short-Term No Change group was sig- 
nificantly lower (t = 2.10, p < .05) than the 
Long-Term Change group on CF. The Short- 
Term No Change mean was 1.16; the Long- 
Term Change mean was 4.00. Second, the 
Short-Term Change group was significantly 
higher (t = 2.05, p < .05) than the Long- 
Term Change group on Fk (means were 1.0 
and .28, respectively) and significantly lower 
(t = 2.87, p < .01) on adequate responses to 
Card IV (means were 1.52 and 3.28). 


RPRS Results 


Following the Kirkner, Wisham, and Giedt 
(1953) study, form level and color were not 
included as measures of potential ego strength 
on the RPRS. The movement (M, FM, m) 
scores were totaled in both raw and weighted 
form (Klopfer et al., 1951) to see if they 
might differentiate the therapy groups as an 
additional comparison. There were no dif- 
ferences on the RPRS between the No Change, 
Change, or Symptomatic Improvement groups. 
Shading raw scores differentiated between the 
Short-Term (2.14) and Long-Term (.96) 
therapy groups (p < .05). 


Discriminant Analysis Approaches 


The discriminant functions developed by 
Gibby et al. (1954) and Affleck and Mednick 
(1954) were applied to the subjects in our 
college student sample. Table 1 presents the 
comparisons of the therapy groups on both 
discriminant functions. As inspection of Table 
1 reveals, mean scores on the Gibby et al. 
(1954) equation significantly differentiated 
the No Change and Symptomatic Improve- 
ment groups (p < .01) and the No Change 
and Change groups (p < .05). The Affleck 
and Mednick (1954) function did not prove 
effective on this sample in terms of predicting 
outcome. 

The Gibby et al. (1954) cutting score was 
.02. Applying this cutting score to the sample 
resulted in a 53% accuracy in Change versus 
No Change classification and 55% accuracy 
in a combined Change and Symptomatic Im- 
provement versus No Change classification. 

Using a cutting score of .015 markedly im- 
proved the predictive efficiency. With this 
new cutting score, accuracy in Change versus 
No Change classification was 70%, and accur- 
acy in the combined Change and Symptomatic 
Improvement versus No Change classification was 74%. 
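
The effect of moving the cutting score can be seen in a small sketch. The weights are those given in the note to Table 2 for the Gibby et al. (1954) formula, reading the garbled third term as + .00010 m. The classification direction (scores at or above the cutting score are called Change) is an inference from the group means in Table 1, not something the text states, and the function names and the example protocol are hypothetical.

```python
def gibby_score(r, total_k, m):
    """Gibby et al. (1954) discriminant score as printed in the Table 2 note:
    Y = .00049 R + .00212 Total K + .00010 m (third term reconstructed)."""
    return 0.00049 * r + 0.00212 * total_k + 0.00010 * m


def classify(score, cutting_score):
    """Assumed direction: higher scores indicate Change (inferred from the
    Table 1 means of .023 for Change and .014 for No Change)."""
    return "Change" if score >= cutting_score else "No Change"


def accuracy(cases, cutting_score):
    """Proportion of (score, actual_label) cases classified correctly."""
    hits = sum(1 for score, label in cases if classify(score, cutting_score) == label)
    return hits / len(cases)


# Hypothetical protocol: R = 30, total K = 2, m = 3.
score = gibby_score(30, 2, 3)              # about .0192
with_original_cut = classify(score, 0.02)  # below the original cutting score
with_lowered_cut = classify(score, 0.015)  # above the lowered cutting score
```

A protocol scoring between .015 and .02 is therefore reclassified from No Change to Change when the cutting score is lowered. Because the new cutting score was chosen on the same sample it was evaluated on, the improved accuracy would need cross-validation.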

The final discriminant function to be pre- 
sented was developed on this sample. The 
first discriminant analysis used 23 Rorschach 
variables and did not attempt to select among 
them in terms of their overall contribution. 
Using all 23 variables as predictors, 80% of 
the Short-Term therapy group, 85% of the 
Long-Term therapy group, 90% of the No 
Change group, 80% of the Change group, and 
78% of the Symptomatic Improvement group 
were correctly identified. Having so many pre- 
dictor variables, however, was unwieldy. 
The optimal number of variables consistent 
with reasonable accuracy was found to be 
six for the Short-Term-Long-Term prediction 
plus a constant. This resulted in correct 
identification of 47% of the Short-Term cases 
and 75% of the Long-Term cases, for an 
overall predictive efficiency of 58%. For the 


TABLE 1 

COMPARISON OF RORSCHACH MEAN DISCRIMINANT- 
FUNCTION SCORES FOR THERAPY GROUPS 

                                  Gibby et al.    Affleck and 
                                  (1954)          Mednick (1954) 
Comparison and group              M               M 

1. Short-Term                     .021            [not legible] 
   Long-Term                      .020            [not legible] 

2. No Change                      .014            [not legible] 
   Symptomatic Improvement        .022**          [not legible] 

3. No Change                      .014            [not legible] 
   Change                         .023*           [not legible] 

4. Symptomatic Improvement        .022            [not legible] 
   Change                         .023            [not legible] 

* p < .05 
** p < .01 
TABLE 2 

DISCRIMINANT-FUNCTION EQUATIONS FOR USE WITH THERAPY LENGTH AND OUTCOME GROUPS 


Short-Term Therapy and Long-Term Therapy 

Short-Term: Y = 3.004 − .284 m − .390 K + .003 M + .143 H + 284.500 Gibby value + .015 Affleck value 

Long-Term: Y = −3.420 − .331 m − .512 K − .602 M + .429 H + 327.75 Gibby value + .038 Affleck value 


No Change, Change, and Symptomatic Improvement 

No Change: Y = −14.688 − .798 R + 2.391 CF + .175 (FC-CF) + .403 A% + .103 F+% − 1.266 Sum C 
− 1.638 Fc − 1.891 m − 7.717 K + 3.809 M − 3.138 H + 2426.26 Gibby value − .167 Affleck value 

Change: Y = −16.656 − 1.503 R + 2.092 CF + .202 (FC-CF) + .405 A% + .110 F+% − .588 Sum C 
− 1.902 Fc − 3.098 m − 11.110 K + 5.793 M − 4.734 H + 3998.68 Gibby value − .257 Affleck value 

Symptomatic Improvement: Y = −14.844 + .468 R + 2.007 CF + .282 (FC-CF) + .355 A% + .096 F+% 
− 1.068 Sum C − 1.343 Fc + .773 m − 1.269 K + 1.223 M − .891 H 
− 329.625 Gibby value − .031 Affleck value 


Note. Gibby et al. (1954) formula: Y = .00049 R + .00212 Total K + .00010 m. Affleck et al. (1954) formula: Y = R 
+ 19.8 M − 18.6 H. 


No Change-Change-Symptomatic Improve- 
ment prediction, there were 13 variables plus 
a constant. This resulted in correct identi- 
fication of 60% of the No Change group, 62% 
of the Change group, and 67% of the Symp- 
tomatic Improvement group, for an overall 
predictive efficiency of 62%. 

Since these discriminant equations were de- 
veloped on this sample, they must be cross- 
validated. Table 2 presents the equations for 
use in cross-validational studies. The method
of discriminant analysis used in this study 
followed Rao (1965). Each subject’s Ror- 
schach scores are placed in each discriminant 
function. The resulting highest value deter- 
mines classification instead of any specific 
cutting score. 
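
The classification rule just described (evaluate every group's function and assign the subject to the group with the largest value) can be sketched as follows. This is only an illustration under stated assumptions: the coefficients are copied from the therapy-length equations of Table 2 as printed, while the names `LENGTH_EQUATIONS`, `affleck_value`, `discriminant_score`, and `classify`, and the sample subject, are hypothetical and not part of the study.

```python
# Coefficients transcribed from the therapy-length equations of Table 2
# as printed; they are illustrative and still await cross-validation.
LENGTH_EQUATIONS = {
    "Short-Term": {"const": 3.004, "m": -0.284, "K": -0.390, "M": 0.003,
                   "H": 0.143, "gibby": 284.500, "affleck": 0.015},
    "Long-Term": {"const": -3.420, "m": -0.331, "K": -0.512, "M": -0.602,
                  "H": 0.429, "gibby": 327.75, "affleck": 0.038},
}


def affleck_value(R, M, H):
    """Affleck formula from the note to Table 2: Y = R + 19.8 M - 18.6 H."""
    return R + 19.8 * M - 18.6 * H


def discriminant_score(coeffs, scores):
    """Constant plus the weighted sum of one subject's scores."""
    return coeffs["const"] + sum(
        weight * scores[name]
        for name, weight in coeffs.items()
        if name != "const"
    )


def classify(equations, scores):
    """Return the group whose function gives the highest value, plus all values."""
    values = {group: discriminant_score(c, scores)
              for group, c in equations.items()}
    return max(values, key=values.get), values


# Hypothetical subject (not data from the study).
subject = {"m": 1, "K": 1, "M": 2, "H": 3, "gibby": 0.020,
           "affleck": affleck_value(R=25, M=2, H=3)}
group, values = classify(LENGTH_EQUATIONS, subject)
print(group, values)
```

No cutting score appears anywhere in the sketch; the decision rests entirely on which function yields the largest value, which is the property the text emphasizes.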


Discussion 


With the exception of one approach to dis- 
criminant analysis developed in this study, 
all the Rorschach usages were applications of 
techniques previously found effective with re- 
lated clinical problems, such as predicting 
therapy terminators. Almost without excep-
tion, the previous applications had been with
hospitalized or other clinic patients.

The study conducted herein was, therefore, 
in no sense a replication of previous studies.
This would have involved a precise repeat 
with the same clinical problem, methodology, 
and a comparable sample. Rather, the 
attempt was made to ascertain whether ap-
proaches effective with related problems could
be of assistance in predicting membership 
for a group of college students in short-term 
versus long-term therapy groups or no change 
versus change versus symptomatic improve- 
ment therapy groups. 

The discriminant-function approach was 
the most productive, with the exception of 
the Affleck and Mednick (1954) system. The 
Gibby et al. (1954) work was originally de- 
signed to identify terminators and continuers 
in therapy. The authors had been able to 
correctly classify 67% of their research sam- 
ple. In our study, using a t-test analysis of 
the discriminant scores, there was no differ- 
ence between Short-Term and Long-Term 
therapy groups, but significant differences 
between the No Change group and both 
Change (p < .05) and Symptomatic Improve- 
ment (p < .01) groups. 

The Gibby et al. (1954) cutting score of 
.02 resulted in a 53% accuracy in Change ver-
sus No Change classification. Using a .015
cutting score improved the correct classifica- 
tions to 70%. When Change and Symptoma- 
tic Improvement groups were combined ver- 
sus No Change, the percentage of correct 
classifications was 74%, but when everyone
was called a Change-Symptomatic Improve- 
ment, the base rate resulted in a 75% ac- 
curacy. The cutting score did not improve 
predictive efficiency. 

An important question, however, with col-
lege students is, “Who will or will not bene-
fit from therapy within the university facil-
ities?” so that those who will not may
either be referred to another clinical setting 
for more intensive help or encouraged to seek 
other means for obtaining help with personal 
problems. This is important particularly if 
there is a waiting list and treatment priorities 
must be established. 

In the present study, therefore, the accur- 
acy in identifying the No Change group pro- 
vides an important index of predictive effi- 
ciency. Meehl and Rosen (1955) have in- 
dicated the need to take base rates into 
account in determining the usefulness of 
predictive instruments. In this sample 60% 
were rated as changed, 20% as symptomati- 
cally improved, and 20% as not changed. 
With a cutoff score of .015, 90% of the No 
Change group, 30% of the Change group, 
and 10% of the Symptomatic Improvement 
group fell below the cutoff score. 
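
The base-rate argument can be made concrete with a short calculation. The sketch below uses only the rounded percentages quoted in this paragraph, so its results need not reproduce the 74% and 75% figures reported earlier; the function name `cutoff_summary` and the dictionary layout are hypothetical.

```python
# Base rates and proportions falling below the .015 cutoff, as quoted in the text.
BASE_RATE = {"No Change": 0.20, "Change": 0.60, "Symptomatic Improvement": 0.20}
BELOW_CUTOFF = {"No Change": 0.90, "Change": 0.30, "Symptomatic Improvement": 0.10}


def cutoff_summary(base_rate, below_cutoff, target="No Change"):
    """Summarize a rule that predicts `target` for every score below the cutoff."""
    # Share of the whole sample that falls below the cutoff.
    flagged = sum(base_rate[g] * below_cutoff[g] for g in base_rate)
    # Share of the whole sample that is `target` and falls below the cutoff.
    true_hits = base_rate[target] * below_cutoff[target]
    # Correct decisions: target cases below the cutoff plus other cases above it.
    correct = true_hits + sum(
        base_rate[g] * (1 - below_cutoff[g]) for g in base_rate if g != target
    )
    return {
        "flagged": flagged,
        "precision": true_hits / flagged,             # P(target | below cutoff)
        "accuracy": correct,
        "base_rate_accuracy": 1 - base_rate[target],  # predict `target` for no one
    }


summary = cutoff_summary(BASE_RATE, BELOW_CUTOFF)
print(summary)
```

With these rounded inputs the cutoff rule is correct for a slightly smaller share of the sample than simply predicting change for everyone, although it does pick out most of the No Change cases. This is the trade-off that Meehl and Rosen (1955) ask investigators to weigh.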

The components of the Gibby et al. (1954) 
discriminant function (R, total K, m) re- 
semble elements previously found to be re- 
lated to therapy outcome. Kirtner et al. 
(1953), for example, in discussing the favor- 
able dimensions of the RPRS, noted that 
“free floating anxiety (K, KF) . . . intro- 
spection (FK), ability to become aware of 
conflict (m) . . . are favorable personality
characteristics for psychotherapeutic prog- 
ress [p. 469].” The number of Rorschach 
responses (R) has frequently been related 
to therapy, though more often in relation to 
termination-continuation (Auld & Eron,
1959; Gibby et al., 1954; Kotkov & Meadow,
1953). R is generally associated in Rorschach
interpretation with verbal productivity and 
expressiveness, certainly important aspects of 
response to psychotherapy. It would seem that 
what the Gibby et al. (1954) equation has 
done is to provide an effective weighting of 
the R, total K, and m scores. 

The final result to be discussed is the dis- 
criminant function developed on this sample 
using the Rao (1965) technique. With a 
relatively small sample and a necessity for 
cross-validation, this discriminant equation 
in Table 2 should only be used for research 
purposes at present. An important task for
future research will be the attempt to reduce
the number of predictor variables in subse-
quent studies without markedly reducing pre-
dictive efficiency.


DIGIT SYMBOL PERFORMANCE OF SUBJECTS VARYING IN
ANXIETY AND DEFENSIVENESS 


MYRON BOOR AND THOMAS SCHILL


Southern Illinois University 


Ss were given the Taylor Manifest Anxiety scale and the Marlowe- 
Crowne Social Desirability scale (M-C SD). High-anxious Ss were found to 
be primarily nondefensive (low M-C SD scores) compared to the low-anxious 
Ss, half of whom were defensive (high M-C SD scores) and half nondefensive. 
A significant difference on digit symbol performance favoring low-anxious Ss 
was found only when the defensive low-anxious Ss were eliminated from the 
analysis. It is suggested that equivocal findings in past research of this sort 
may have resulted in part from the failure to eliminate these defensive low- 
anxious Ss, who appear to experience manifest anxiety but refuse to admit it. 


A number of studies (Heineman, 1953; 
Kerrick, 1955; Matarazzo, 1955; Sarason, 
1956) have reported substantial negative cor- 
relations between the MMPI K scale and 
Taylor’s Manifest Anxiety (MA) scale. This 
suggests that although some subjects who 
score low on the MA may lack maladjustive 
symptoms, others may possess manifest anxi- 
ety but defensively deny that this is the case. 
Therefore, in order to use the scale most ef- 
fectively it would seem necessary to be able 
to differentiate these two types of low scorers. 

One procedure to accomplish this end might 
be to administer a social desirability (SD) 
scale in conjunction with the MA. Since per- 
sons who score high on SD scales are intent 
on presenting a defensive self-picture, they 
would not be expected to endorse many of 
the MA items, as this would mean admitting 
to problems and shortcomings. Such indi- 
viduals may be significantly minimizing their 
anxiety levels. On the other hand, persons 
with low scores on SD scales should probably 
describe themselves more accurately, since 
they lack this defensive response set. Thus, 
people who score high on the MA are freely 
and openly admitting to symptoms of anxiety 
and should therefore have relatively low SD 
scores. Subjects who score low on both scales 
would most likely be individuals who simply 
lack symptoms, since they appear to be de- 
scribing themselves in a relatively straight- 
forward, nondefensive manner. 

Kogan and Wallach (1964) have ap-
proached this problem by using the Marlowe-
Crowne Social Desirability scale (M-C SD;
Crowne & Marlowe, 1960) as a measure of
defensiveness in relationship with the Alpert- 
Haber Anxiety Scale in studying personality 
correlates of risk-taking behavior. They found 
that risk taking was related to anxiety only 
when defensiveness was taken into account. 
That the M-C SD measures defensiveness has 
been suggested by a number of people (e.g., 
Crowne & Marlowe, 1964, Part III).

The present study attempted to investigate 
Wechsler Digit Symbol performance as a 
function of anxiety, as measured by the MA, 
and defensiveness, as measured by the M-C 
SD. Guertin, Rabin, Frank, and Ladd (1962) 
indicated that studies using a wide variety 
of subjects found no consistent relationship 
between the MA scale and Wechsler scores. 
The Guertin et al. (1962) references in- 
cluded studies dealing with MA and digit 
symbol performance. However, based on the 
considerations thus far mentioned it seemed
possible that by eliminating defensive low-
anxious subjects (those who supposedly pos-
sess manifest anxiety but are reluctant to
admit it) from a low-anxious group, differ-
ences on digit symbol performance between
this resultant low-anxious group and a high-
anxious group should be enhanced.


TABLE 1

SUMMARY OF FOUR ANXIETY GROUPS AND THEIR
MEAN MA AND M-C SD SCORES

                                    Male                    Female
Group                        n     MA    M-C SD      n     MA    M-C SD
High-anxious                39    26.3    12.1      48    32.5    11.5
Low-anxious                 62     9.5    16.1      62    12.3    15.0
Defensive, low-anxious      34     8.7    20.4      24    11.6    19.8
Nondefensive, low-anxious   28    10.6    10.8      38    12.7    12.0


METHOD 
Subjects 


The Ss consisted of 159 male and 187 female 
undergraduates. All were given the MA and M-C
SD. The range of MA scores for males was 1-41
(Mdn = 16) and for females was 5-42 (Mdn = 23).
The Ss were divided into high- and low-anxious
groups on the basis of having scored in approxi-
mately the upper fourth or lower third of the MA
distribution. On the basis of normative data pre-
sented by Crowne and Marlowe (1964), low-anxious
Ss were further divided into defensive (M-C SD
scores 16 and above) and nondefensive (M-C SD
scores 15 and below) groups. The resulting groups
and their MA and M-C SD scores are summarized
in Table 1.


1 As expected, high-anxious Ss had much more
homogeneous M-C SD scores than low-anxious Ss.
Approximately 83% of high-anxious Ss scored below
16, whereas about 53% of low-anxious Ss scored


FIG. 1. Performance curves for the various anxiety groups on successive trials of digit
symbol. (Abscissa: Trials; ordinate: Mean Digit Symbols Completed; legible curve
labels: Lo A (True), Lo A (All), Lo A (Defensive).)
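
The grouping rule described under Subjects can be sketched in a few lines. The MA cutoffs that bounded the upper fourth and lower third are not reported, so `low_ma_max` and `high_ma_min` below are hypothetical parameters; only the M-C SD split at 16 comes from the text, and the function name `assign_group` is likewise an assumption.

```python
def assign_group(ma, mc_sd, low_ma_max, high_ma_min, defensive_min=16):
    """Assign one subject to an anxiety group.

    low_ma_max and high_ma_min are hypothetical MA cutoffs standing in for the
    lower third and upper fourth of the MA distribution; the exact values are
    not reported. defensive_min=16 follows the M-C SD split given in the text.
    """
    if ma >= high_ma_min:
        return "high-anxious"
    if ma <= low_ma_max:
        if mc_sd >= defensive_min:
            return "defensive low-anxious"
        return "nondefensive low-anxious"
    return "medium-anxious"


# Hypothetical cutoffs and scores, for illustration only.
print(assign_group(ma=8, mc_sd=20, low_ma_max=12, high_ma_min=25))
```

Because the male and female MA distributions differed, separate cutoffs per sex would presumably be passed in; that detail is also an assumption.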


Procedure 


The Ss were tested at four large group sessions. 
Initially, they were given four consecutive 90-second 
trials of the WAIS Digit Symbol subtest, and 
standard Wechsler instructions were used. Subse- 
quently, Ss were given the MA, M-C SD, and a 
number of other measures (not relevant to this 
study), in that order. 


RESULTS 


Performance curves for the various anxiety 
groups on the Digit Symbol test are presented 
in Figure 1. These data were analyzed sepa-
rately for males and females. The initial
analysis compared the total number of digit 
symbols completed for high- and low-anxious 
subjects, irrespective of their M-C SD scores. 
These results were consistent with the previ- 
ous research referred to in the introduction, 
as there was no significant difference between 
the high- and low-anxious groups (males, t
= .73, df = 99; females, t = 1.74, df = 108).
In both cases p > .05, two-tailed. 

Next, the same analysis was performed, but 
this time the defensive low-anxious subjects 
were eliminated from the low-anxious group. 
In this case, the high-anxious subjects were 
found to be significantly poorer on digit 
symbol performance than the resultant non-
defensive low-anxious groups (males, t =
2.00, df = 65; females, t = 2.03, df = 84).
In both cases, p < .05, two-tailed.

Since it had been suggested that there 
might be a curvilinear relationship between 
anxiety and digit symbol performance (Mat- 
arazzo & Phillips, 1955), scores for the 
medium-anxious groups were taken into ac- 
count. However, for both males and females 
these scores fell between those of the high- 
anxious and nondefensive low-anxious groups, 
suggesting, instead, a linear relationship. 

Finally, a number of interesting sex dif- 
ferences were found. Females completed sig- 
nificantly more digit symbols (M = 275.72) 
over the four trials than males (M = 246.73). 
This difference was significant beyond the 
.001 level (t = 8.29, df = 344). Females also
had higher mean MA scores (females, M
= 21.78; males, M = 16.21; t = 6.53; df =
344; p < .001, two-tailed).


below 16. Correlations between the two scales were:
males, r = -.35; females, r = -.31 (p < .01).


Discussion 


The results clearly support the notion of 
two types of low-anxious responders on the 
MA. While high-anxious subjects tended to 
be primarily nondefensive, about half of the 
low-anxious subjects tended to be defensive 
and the other half nondefensive. Furthermore, 
although there was no significant difference 
on digit symbol performance between all 
high-anxious and all low-anxious subjects, 
when defensive low-anxious subjects were 
eliminated from the analysis a significant 
difference was found. 

Finding that the high-anxious subjects 
performed more poorly than the nondefensive 
low-anxious subjects is consistent with clini- 
cal notions that anxiety is disruptive. One 
can also conceptualize the data as conform- 
ing to the notion that competing responses, as 
are present on the digit symbol task, take 
longer to extinguish for high-drive subjects. 

The fact that previous studies relating MA 
scores and digit symbol performance have 
often found either no relationship or a sug- 
gested curvilinear relationship may be due in 
part to the confounding influence of the de- 
fensive low-anxious subjects. That the de- 
fensive low-anxious subjects had a higher 
level of anxiety than the nondefensive low- 
anxious subjects is suggested by the former’s 
poorer digit symbol performance. In fact, male
defensive low-anxious subjects performed as 
poorly as the male high-anxious group. 

The data also lend some construct validity 
to the M-C SD as a measure of defensiveness. 
It would appear that this measure can be par- 
ticularly useful in differentiating defensive 
and nondefensive, or “true,” low scorers when 
used in conjunction with scales whose items 
deal primarily with pathological content. 

Finally, it was found that female college 
students performed better on digit symbol 
than males. Females also had higher MA 
scores than males. These latter two findings 
certainly should be considered when using 
digit symbol and when using the MA. This,
unfortunately, has not always been the case
in previous research.


TEMPORAL EXPERIENCE IN DEPRESSIVE STATES
AND SCHIZOPHRENIA


3 groups of Ss, 20 schizophrenics, 20 depressives, and 20 normals, were com-
pared on measures of future time perspective (coherence and extension), time
perception (short and long intervals), and time orientation. Schizophrenics and
depressives differ from normals in time perspective extension and coherence.
Schizophrenics show less coherence than depressives, and depressives show a
more curtailed future perspective than the schizophrenics. Both pathological
groups estimate time less accurately and are less future oriented than normals.


The phenomena of temporal experience 
have stimulated an increasing amount of 
psychological research in recent years. The 
clinician’s present interest in this area has 
developed because of its relevance to person- 
ality development and functioning, both nor- 
mal and abnormal. Studies concerned with 
this relationship between the experience of 
time and psychological functioning have 
dealt with the following three aspects of 
temporal experience: (a) The concept of 
time perception has been used in the study 
of the human capacity to estimate, repro- 
duce, or produce specific units of time. (b) 
The concept of time perspective has been 
based on the notion proposed by Wallace 
(1956) of “the timing and ordering of per- 
sonalized events.” While the theoretical in- 
terest here has concerned the projection of 
the self in the temporal dimension—past, 
present, and future—the empirical data ob- 
tained have mainly concerned the temporal 
dimension of the future; furthermore, the 
investigators studying this aspect of tem- 
poral experience have generally used mea- 
sures of “extension,” which refers to the 
amount of future time the individual can 
conceptualize, and “coherence,” which refers 
to the logical order imposed by the individual 
on elements of the time span. (c) Time ori- 
entation and time perspective have often 
been used interchangeably, but for the pur- 
pose of this study a slight distinction needs 
to be made. In this discussion, time orienta-
tion is defined as the direction or orientation
(present, past, or future) of the person’s
temporal experience.


1 This research was submitted as the master’s
thesis of the senior author, Department of Psychol-
ogy, Michigan State University, 1964.

Several empirical accounts have related 
the above aspects of temporal experience to 
abnormal psychological functioning. Wallace 
and Rabin (1960) have provided an exten- 
sive review of the research in this area. Tem- 
poral disturbances in schizophrenia have re- 
ceived the most attention, and the evidence 
has generally indicated that the schizophren- 
ic’s ability to estimate time is disrupted 
(Rabin, 1957) and that his future time per- 
spective is curtailed (Wallace, 1956). While 
some investigators have also looked at the 
temporal disturbance in depressive states, 
their accounts have been mainly speculative 
and descriptive. Since the investigation of 
temporal experience in various nosological 
groups may provide valuable information for 
the development of theories of personality 
and diagnostic techniques, the present re- 
search has attempted to systematically com- 
pare the temporal experience of patients ex- 
hibiting depressive states, of schizophrenic 
patients, and of a normal control group. In 
general, it was predicted that the process of 
depression has an observable effect on tem- 
poral experience that is different from that 
observed in schizophrenia and that both 
pathological groups differ from the normals 
in the experience of time. Three specific hy- 
potheses were proposed: 


1. The depressive group, the schizophrenic
group, and the normal control group differ sig-
nificantly in the extension and coherence of fu-
ture time perspective.

2. The depressive group and the schizophrenic
group make less accurate estimates of the passage 
of time than the normal control group. 

3. The depressive group and the schizophrenic 
group are less frequently future oriented than the 
normal control group. 


METHOD 
Subjects 


The total sample of 60 Ss was divided into three 
groups: a schizophrenic group, a depressive group 
composed of 6 Ss diagnosed as psychotic depressive 
and 14 Ss diagnosed as psychoneurotic depressive, 
and a normal control group. The 20 schizophrenic 
Ss and the 20 depressive Ss were patients in the 
psychiatric service of the Detroit Receiving Hos- 
pital. At the time of the study, the psychiatric 
service at this hospital was primarily a receiving 
service where patients remained for only a short 
time before being transferred to other psychiatric 
services in the state if extensive hospitalization and 
treatment were indicated. Psychiatric Ss reportedly 
had not been previously hospitalized for a psychi- 
atric disorder. They were interviewed by E during 
the first week of their hospitalization and were not 
receiving ECS or drug therapy. Each S had been 
extensively interviewed by a psychiatric resident, 
who generally determined the diagnosis. The case 
studies of many Ss had also been presented at staff 
meetings, and the diagnosis was considered by the 
entire psychiatric staff. The normal control group 
included 20 patients in the medical service of the 
same hospital. This final group of Ss presented a
wide range of medical problems, but none of them 
had received surgical treatment. Moreover, it had 
been ascertained by interviewing the patients and 
by consulting the medical staff that there was no 
evidence of any significant psychopathological dis- 
turbance accompanying the patient’s illness. 

Since previous research has noted that the varia- 
bles of sex, age, educational level, and intellectual 
level have some effect on temporal experience, the 
following controls were provided in this study: 

Sex. An equal number of males and females were 
included in each of the three groups. 

Age. Only Ss 20-50 years of age were included in 
the sample. Using an analysis of variance statistical 
test, the investigators found that the mean ages of 
35.40 for depressives, 33.90 for schizophrenics, and 
32.40 for normals were not statistically different. 

Educational level. Only Ss who had received 8-14 
years of normal schooling were included. Again it 
was found that the mean educational levels of 10.60 
for depressives, 10.95 for schizophrenics, and 11.00 
for normals were not statistically different. 

Intelligence level. Only Ss who had received raw 
scores 30-60 on the Wechsler Vocabulary subscale 
were included in the sample. Again the mean 
Wechsler Vocabulary scores of 36.55 for depressives, 
36.50 for schizophrenics, and 34.85 for normals were
not statistically different.


Instruments


Measures of future time perspective. The first 
experimental task was used to obtain a measure of 
the extension and the coherence of S’s future time 
perspective. Ten common life events that would 
be likely to occur in the lifetime of an American 
male or female were read to Ss. Each S was asked
to tell how old he might be when the event might 
happen to him. A measure of extension was ob- 
tained by finding the range of years between the 
S’s age and the most distant future event given 
by him. After completion of the fourth task, S was 
presented with 10 cards, each with 1 of the 10 
events typed on it; S was asked to arrange 
the cards in the order that the events might hap- 
pen in his life. Accordingly, the measure of coher- 
ence was obtained by finding the correlation be- 
tween the ordinal ranking of the events with regard 
to the age of occurrence obtained in the first part 
of this task and S’s arrangement of events obtained 
in the second part of the task. This technique is a 
modification of one used by Wallace (1956). 
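
The two scores derived from this task can be sketched as follows. The text specifies a correlation between two orderings of the same ten events; the sketch assumes a Spearman-type rank correlation (a Pearson correlation computed on ranks, with tied ages given average ranks), which is an assumption about the exact coefficient used. All function names and the sample answers are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def extension(current_age, event_ages):
    """Years between S's present age and the most distant future event."""
    return max(event_ages) - current_age


def coherence(event_ages, card_positions):
    """Rank correlation between the age-based order and the card arrangement."""
    return pearson(average_ranks(event_ages), average_ranks(card_positions))


# Hypothetical answers from one S aged 30: ages given for the 10 events and
# the position (1-10) each event later received in the card arrangement.
ages = [32, 35, 40, 45, 50, 55, 60, 65, 70, 75]
cards = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(extension(30, ages), coherence(ages, cards))
```

A subject whose card arrangement exactly follows the ages given earlier obtains the maximum coherence, and reversing the arrangement gives the minimum.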

A second task was also used to obtain a measure 
of the extension of S’s future time perspective. 
This task consisted of a story-completion technique 
which was originally developed by Barndt and 
Johnson (1955). The beginning statement of four 
stories was read by E to each S; S finished the 
story, and his response was recorded. The S also
indicated the length of time involved in the action 
described in his story. The length of time given by 
S was used as a measure of “extension.” 

Measure of time perception. The third task pro- 
vided a measure of S’s ability to estimate the pas- 
sage of time. At the midpoint of the experimental 
session (Mdn = 14 minutes) and again at the end of 
the session (Mdn = 31 minutes), S was simply asked 
to estimate how much time had passed since he 
and E had begun talking. Accuracy of the estima- 
tion of the period of time was based on the ratio 
of the estimated time to the actual time recorded 
by E. This ratio was then expressed in a percentage,
and the percentages were categorized as follows: 
(a) overestimation, 120% and above; (b) under- 
estimation, 80% and below; (c) close estimation, 
81%-119%. 
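
The scoring rule for time estimates amounts to a ratio followed by a three-way cut. A minimal sketch, with a hypothetical function name, is given below; percentages falling strictly between 80 and 81, or between 119 and 120, are not covered by the stated categories and are treated here as close estimates, which is an assumption.

```python
def estimation_category(estimated_minutes, actual_minutes):
    """Categorize a time estimate by its ratio to the actual elapsed time."""
    percent = 100.0 * estimated_minutes / actual_minutes
    if percent >= 120:
        return "over"
    if percent <= 80:
        return "under"
    return "close"


# Hypothetical estimate of 20 minutes for an actual interval of 14 minutes.
print(estimation_category(20, 14))
```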

Measure of time orientation. Four TAT cards 
were employed in the fourth task in order to deter- 
mine S’s time orientation. The technique stemmed 
from the work of LeShan (1952). The Ss were pre- 
sented TAT Cards 2, 4, 6BM, and 7BM, all of the 
cards representing interpersonal situations. Instead of
using the usual instructions for the administration
of this projective test, E asked S merely to tell a
story about the picture. The E rated each story 
as either past oriented, present oriented, or future 
oriented. The following score values were correspond- 
ingly assigned to each story: past oriented, scoring 
value of 1; present oriented, scoring value of 2; 
future oriented, scoring value of 3. By summing 
the score values for the four stories, a measure of 
time orientation was obtained for each S. Another
judge also rated each story, and an interjudge reli-
ability of r = .77 was found.


TABLE 1

COMPARISON OF MEDIANS AND SUM OF RANKS OF EXTENSION SCORES ON TWO EXPERIMENTAL TASKS
FOR DEPRESSIVES, SCHIZOPHRENICS, AND NORMALS

               Depressive          Schizophrenic        Normal
Item           Mdn       SR        Mdn       SR         Mdn        SR        H
Task 1         27 yr.    390.0     35 yr.    617.5      49 yr.     822.5     15.35**
Task 2
  Story 1      2 hr.     530.0     1.5 hr.   541.5      3 hr.      758.5      5.40
  Story 2      1 hr.     531.5     2 hr.     534.0      3 hr.      764.5      5.87
  Story 3      45 min.   437.5     4 hr.     618.5      4.5 days   774.0      9.30**
  Story 4      2 days    411.5     7 mo.     640.0      2 yr.      778.5     11.26**

Note. Abbreviated: Mdn = median; SR = sum of ranks.
* p < .01.
** p < .001.


RESULTS 


Consideration of the nature of the distri- 
bution of scores obtained from the experi- 
mental tasks suggested that the assumption 
of normality was questionable. Therefore, 
nonparametric techniques were employed in 
the statistical tests of significance. The 
Kruskal-Wallis test appeared to be the most 
powerful of the nonparametric tests for k- 
independent samples and, therefore, provided 
a useful alternative for the parametric analy- 
sis of variance in the statistical analysis of 
the scores obtained in Tasks 1 and 2. Since 
the Kruskal-Wallis test calls for ordinal 
measurement, it could not be used for the 
statistical analysis of the scores obtained in 
Tasks 3 and 4, where the data consisted of 
frequencies in discrete categories. The non- 
parametric technique that appeared the most 
appropriate for these tasks was the chi-square 
test for independent samples. 
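
For frequencies in discrete categories, the chi-square statistic for independent samples compares each observed count with the count expected from the row and column totals. The sketch below is a generic implementation with a hypothetical function name; the counts are the first-interval frequencies of Table 3 (rows: under, over, close; columns: depressive, schizophrenic, normal).

```python
def chi_square_independence(table):
    """Pearson chi-square and degrees of freedom for an r x c frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# First-interval counts from Table 3.
first_interval = [
    [6, 5, 3],    # under
    [10, 10, 8],  # over
    [4, 5, 9],    # close
]
chi2, df = chi_square_independence(first_interval)
print(round(chi2, 2), df)
```

For these counts the statistic comes to about 3.62 on 4 degrees of freedom, which agrees with the value given in the note to Table 3.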


Future Time Perspective 


An overall analysis of the differences be- 
tween the three groups in regard to the scores 
derived from Task 1 produced a significant 
H score of 15.35 (p < .001). The data, there- 
fore, were consistent with the prediction that 
the three groups differ significantly in regard 
to the extension of future time perspective. 
Furthermore, the data indicated that the 
lowest median extension score was obtained
by the depressive group, while the highest
was obtained by the normal control group. 
Significant differences between the three 
groups in regard to the extension scores were 
also found for Stories 3 and 4 of Task 2. 
Again the data indicated that the median 
extension scores of the depressive group were 
the lowest, while those of the normal control 
group were the highest. The extension scores 
derived from Stories 1 and 2 did not differen- 
tiate between the three groups at the .05 
level of confidence; nevertheless, the same 
general trend was still evident. 
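
The H values can be recomputed from the sums of ranks in Table 1 with the standard Kruskal-Wallis formula, H = 12 / (N(N + 1)) × Σ(R_i² / n_i) - 3(N + 1). The sketch below applies no correction for ties, which is an assumption about how the original values were obtained, and the function name is hypothetical.

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    """Kruskal-Wallis H from per-group rank sums, without tie correction."""
    n_total = sum(group_sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, group_sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Task 1 sums of ranks for depressives, schizophrenics, and normals (n = 20 each).
task1_h = kruskal_wallis_h([390.0, 617.5, 822.5], [20, 20, 20])
print(round(task1_h, 2))
```

A useful check on any row of the table is that the three sums of ranks must add to N(N + 1)/2 = 1830 for N = 60.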

An analysis of the coherence scores pro- 
duced a significant H score of 15.49 (p< 
.001). It can thus be concluded that there 
are significant differences between the co- 
herence of future time perspective for the 
three groups. From the data presented in 
Table 2, the direction of these differences can 
be noted. The schizophrenic group presented 
the least coherent future time perspective; 
on the other hand, the depressive group was 
somewhat more coherent, while the normal 
control group gave the highest rank correla- 
tion coefficient. 


Time Perception 


No significant differences were found be- 
tween the three groups of subjects in the 
ability to estimate time intervals when the 
median duration was 14 minutes. Neverthe- 
less, the data presented in Table 3 indicate 
that the normal control group was somewhat 
more accurate in its estimates than the depres-
sive and schizophrenic groups. A significant
chi-square of 11.11 (p < .025) was obtained
from the subjects’ estimation of the second
time interval, the median duration of which
was 31 minutes. Therefore, it appears that
depressives, schizophrenics, and normals dif-
fer in their ability to estimate the passage
of time, but only when longer time intervals
are involved. The data presented in Table 3
suggest that the normal group was the most
accurate in its estimates of the second time
interval, while the majority of schizophrenic
and depressive subjects made poor judgments.


TABLE 2

COMPARISON OF MEDIANS AND SUM OF RANKS OF
COHERENCE SCORES ON EXPERIMENTAL TASK
FOR DEPRESSIVES, SCHIZOPHRENICS,
AND NORMALS

Group            Mdn      SR
Depressive       .63     558.5
Schizophrenic    .49     423.0
Normal           .83     848.5

Note. Abbreviated: Mdn = median; SR = sum of ranks.
H = 15.49 (p < .001).


TABLE 3

CATEGORIES OF ESTIMATION OF TWO TIME INTERVALS
BY DEPRESSIVES, SCHIZOPHRENICS, AND NORMALS

                First interval a       Second interval b
Estimation      D     S     N          D     S     N
Under           6     5     3          5     8     4
Over           10    10     8         11     8     4
Close           4     5     9          4     4    12

Note. Abbreviated: D = depressive; S = schizophrenic;
N = normal.
a Mdn = 14 min. χ² = 3.62.
b Mdn = 31 min. χ² = 11.11; p < .025.


TABLE 4

CATEGORIES OF TIME ORIENTATION GIVEN BY
DEPRESSIVES, SCHIZOPHRENICS, AND
NORMALS ON TAT STORIES

Category        D     N          S     N
Past            3     0          2     0
Present        17    11         17    11
Future          0     9          1     9
χ²               9.18**            6.53*

Note. Present and Past categories were combined for the
statistical analysis. Abbreviated: D = depressive; S = schiz-
ophrenic; N = normal.
*p < .02.
**p < .005.


Time Orientation


An overall chi-square comparison between
the three groups with regard to the three
categories of time orientation was not ap-
propriate, since the expected values in six
of the cells were less than five. While the
logic of the theoretical interest would permit
the combining of the past and the present
categories, the expectations in some of the
cells were still small. Therefore, instead of
an overall analysis between the three groups,
two individual comparisons appeared to be
the most appropriate test of the hypothesis.
In a comparison between the depressive group 
and the normal group a chi-square of 9.18 
was obtained (p < .005). A second compari- 
son indicated that the schizophrenic group 
and the normal control group differed in time 
orientation (p < .02). The Yates correction 
(Walker & Lev, 1953) was used in both com- 
parisons. The data in Table 4 indicated that 
the normal control group was more future 
oriented than the two psychopathological
groups.
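
The two comparisons can be checked with the usual Yates-corrected chi-square for a 2 × 2 table, χ² = N(|ad - bc| - N/2)² / [(a + b)(c + d)(a + c)(b + d)]. In the sketch below the Past and Present counts of Table 4 are combined, as in the analysis; the function name is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # The corrected difference is not allowed to fall below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: group; columns: (past or present, future). Counts from Table 4.
depressive_vs_normal = yates_chi_square(20, 0, 11, 9)
schizophrenic_vs_normal = yates_chi_square(19, 1, 11, 9)
print(round(depressive_vs_normal, 2), round(schizophrenic_vs_normal, 2))
```

Both results agree with the chi-square values reported for the depressive-normal and schizophrenic-normal comparisons.
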
Additional Findings 


A comparison of the extension scores ob- 
tained from the psychoneurotic depressives 
and the psychotic depressives revealed no 
statistically significant intragroup differences. 
A comparison of the coherence scores obtained 
from these subsamples, however, revealed 
statistically significant intragroup differences
(p < .05). This finding suggests that the total 
sample of depressive subjects did not repre- 
sent a homogeneous group with regard to the 
coherence of future time perspective. None 
of the differences between these two subgroups 
on the remaining tasks was statistically sig- 
nificant. 

Discussion 


The findings presented generally support 
the hypothesis that temporal experience is 
significantly affected by psychopathological 
disturbances. Furthermore, the results seem 
to indicate that depression has, in many cases, 
a different effect on temporal experience than 
does schizophrenia. 
The data concerning extension indicate 
that the groups can be differentiated from 
each other in regard to the length of time 
used in conceptualizing future events. The 
curtailing of this time span in schizophrenia 
is in agreement with Wallace’s (1956) find- 
ings. In regard to the depressive group, there 
was evidence of an even more severely limited 
future time span, which supports Strauss’s 
(1947) contention that the “road into the 
future has become blocked in depressive states 
[p. 155].” Stories 1 and 2 of Task 2, however, 
did not significantly differentiate between the 
groups, because the content of these two 
stories limited the possible future courses of 
action and presented a relatively circum- 
scribed temporal set. Story 1, for example, 
begins: “At three o’clock one sunny after- 
noon in May two men were out walking near 
the edge of town. . . .” Both stories are 
structured in terms of relatively immediate 
action within narrow temporal bounds. On 
the other hand, the content of Stories 3 and 
4, which deal mostly with the hero’s thought 
(e.g., “He’s thinking of the time when. . .”), 
provided an almost unlimited number of pos- 
sible future courses of action. Here the de- 
pressives appeared to be most affected. 

The measure of coherence also differenti- 
ated between the groups. The difficulty exhib- 
ited by schizophrenics in organizing future 
events in a logical manner is compatible with 
Bellak’s (1958) observation that disturbances 
in secondary-process thinking are an im- 
portant aspect of schizophrenic symptomatol- 
ogy. Despite the sluggishness and retardation 
in depression, secondary-process disturbance 
is not as great as in schizophrenia. This may 
account for the greater coherence in depres- 
sives, although the psychotic depressives were 
less coherent than the neurotic depressives. 

Results obtained with Task 3 indicated 
that both depression and schizophrenia affect 
the ability to assess long periods of time. The 
findings with schizophrenia, which show both 
over- and underestimation, are in agreement 
with Rabin’s (1957) results. More frequent 
overestimation is noted in depression, and 
this agrees with the report by Mezey and 
Cohen (1961): “Every hour seems a year 
to me”; “It [time] is terribly slow—intermin- 
able [p. 269].” The shorter time interval did 
not yield significant differences. This is con- 


sistent with Dobson’s (1954) findings. The 
reports by Dobson (1954) and Rabin (1957), 
as well as the present findings, suggest that 
psychopathological disturbances have a 
greater effect on the estimation of longer time 
intervals than on briefer ones. The judgmental 
process is more involved in estimation of lar- 
ger intervals—a view consistent with that of 
Woodrow (1951)—and is most vulnerable to 
effects of psychopathology. The question 
arises: How long does a time interval need 
to be before psychopathology affects the in- 
dividual’s estimation of duration? 

Consonant with Arieti’s (1947) view con- 
cerning restriction of the psychotemporal field 
in schizophrenia and Strauss’s (1947) sug- 
gestion that the future loses its meaning as 
a harbinger for prospective solution in the de- 
pressed, results obtained on Task 4 indicated 
that both depressives and schizophrenics are 
less future oriented than the normal control 


group. 
REFERENCES 


Arieti, S. The processes of expectation and antici- 
pation. Journal of Nervous and Mental Disease, 
1947, 106, 471-481. 

Bellak, L. (Ed.) Schizophrenia: A review of the 
syndrome. New York: Logos Press, 1958. 

Barndt, R. J., & Johnson, D. M. Time orientation 
in delinquents. Journal of Abnormal and Social 
Psychology, 1955, 51, 343-345. 

Dobson, W. R. An investigation of various factors 
involved in time perception as manifested by 
different nosological groups. Journal of Genetic 
Psychology, 1954, 50, 277-298. 

LeShan, L. L. Time orientation and social class. 
Journal of Abnormal and Social Psychology, 1952, 
50, 277-298. 

Mezey, A., & Cohen, S. The effect of depressive 
illness on time judgment and time experience. 
Journal of Neurology, Neurosurgery, and Psychia- 
try, 1961, 24, 269-270. 

Rabin, A. I. Time estimation of schizophrenics and 
nonpsychotics. Journal of Clinical Psychology, 
1957, 13, 88-90. 

Strauss, E. W. Disorders of personal time in depres- 
sive states. Southern Medical Journal, 1947, 40, 
154-158. 

Walker, H. M., & Lev, J. Statistical inference. 
New York: Holt, 1953. 

Wallace, M. Future time perspective in schizo- 
phrenia. Journal of Abnormal and Social Psychol- 
ogy, 1956, 52, 240-245. 

Wallace, M., & Rabin, A. I. Temporal experience. 
Psychological Bulletin, 1960, 57, 213-236. 

Woodrow, H. Time perception. In S. S. Stevens 
(Ed.), Handbook of experimental psychology. New 
York: Wiley, 1951. Pp. 1224-1236. 


(Received February 28, 1967) 


Journal of Consulting Psychology 
1967, Vol. 31, No. 6, 609-613 


RELIABILITY AND VALIDITY OF INTERNAL-EXTERNAL 
CONTROL AS A PERSONALITY DIMENSION¹ 


PAUL D. HERSCH 


University of Connecticut 


AND 


KARL E. SCHEIBE 


Wesleyan University 


Extensive new data are reported on the test-retest reliabilities and personality 
scale correlates of the internal-external control dimension (I-E). I-E is found 
to relate consistently to measures of maladjustment, with internal scorers less 
maladjusted. I-E is consistently related to a variety of personality scales, 
with internal scorers describing themselves as more active, striving, achieving, 
powerful, independent, and effective. For 2 of 3 samples, internal scorers 
were also significantly more effective as mental hospital volunteers than ex- 
ternal scorers. These results are consistent with those reported in previous 
reviews, but adjectival descriptions of extreme scorers, as well as other data, 
suggest that internal scorers are a more homogeneous group than external 
scorers. Suggestions are offered for differentiation of the concept of externality. 


The internal-external control dimension 
(I-E), as derived from social learning theory 
(Rotter, 1954), posits two characteristic 
“world views” or generalized expectancies 
concerning reinforcement. Based on past ex- 
perience, one group of individuals acquires 
the view that the locus of causality for per- 
sonality-relevant events, or reinforcements, is 
external. Others view events as products of 
their own actions, capacities, or traits. Thus, 
individuals are conceived to vary along a 
“locus of control” dimension, with the end 
points labeled internal and external. 

In recent years there has been substantial 
research on I-E, and the findings were re- 
viewed in articles by Lefcourt (1966) and 
Rotter (1966). Based on these reviews, the 
following generalizations are pertinent to the 
present study: (a) The test-retest reliability 
of the 29-item scale of I-E developed by 
Rotter is consistent and acceptable, varying 
between .49 and .83 for varying samples and 
intervening time periods. (b) Relationships 
to measures of intelligence have generally 
been nil (e.g., Cardi, 1962; Ladwig, 1963). 
There is, however, some evidence that the re- 


1 The research here reported was completed as a 
part of the Connecticut Service Corps Research 
Project and was supported, in part, by the Connec- 
ticut State Department of Mental Health and Grant 
MH02127-02 from the National Institute of Mental 
Health. The Connecticut Service Corps consists of 
approximately 100 students recruited for summer 
work with chronic mental patients. The authors are 
indebted to James A. Kulik for suggestions and help 
on the data analyses reported in this paper. 


lationship may be weakly negative, with in- 
ternal scorers higher in intelligence. (c) Re- 
lationships to a measure of maladjustment 
(Incomplete Sentences Blank, ISB; Rotter 
& Rafferty, 1950) are perhaps curvilinear, 
perhaps nil, perhaps quite complex (Rotter, 
1966). (d) The I-E scale has been useful in 
some cases for behavioral predictions, such 
as discriminating between social action vol- 
unteers and nonvolunteers (Gore & Rotter, 
1963). 

It is the purpose of this article to supply 
further evidence on these and other proper- 
ties of the I-E dimension. In particular, the 
objectives of the present report are to: 


1. Report extensive new normative and relia- 
bility data for college males and females. 

2. Explicate the relation of I-E to maladjustment 
by describing its relation to a variety of 
indexes in addition to the ISB. 

3. Enlarge the meaning of I-E by describing 
its relation to the personality characteristics 
tapped by the California Psychological Inventory 
(CPI; Gough, 1964) and the Adjective Check 
List (ACL; Gough & Heilbrun, 1965). In addi- 
tion, adjectival self-descriptions of extreme scorers 
on I-E are presented. 

4. Test the prediction that I-E will be sys- 
tematically related to the effective performance 
of college students as volunteers on chronic men- 
tal hospital wards. The theoretical meaning of the 
I-E dimension leads to the prediction that the 
more internal subjects will be more effective in 
working with chronic mental patients. Internal 
subjects should have a greater expectation of 
positive change on the mental ward as a result of 
their own efforts and, hence, exert greater efforts 
to bring about those changes than externals, who 
should feel relatively powerless. 
METHOD 
Subjects 


The Ss were members of the 1964 Connecticut 
Service Corps (SC), the 1965 SC, the 1965 control 
group (CG), the 1966 SC, and the 1966 CG.² The 
numbers and sex distributions of all groups are in- 
dicated in Table 1. The SC, for all 3 years, con- 
sisted of college volunteers who spent 8 weeks 
working on selected chronic wards of Connecticut’s 
four state mental institutions. 

The 1965 and 1966 CG Ss consisted of groups of 
college students attending summer school who were 
comparable in age, education, and marital status to 
SC Ss. 


Procedures and Materials 


As part of a more general program to assess the 
impact of the SC on students, patients, and the 
hospital milieu, a battery of tests and questionnaires 
was administered to the SC students in group ses- 
sions during the first week after arrival at the 
hospital. The test battery was about 4 hours in 
length. In all years, a revised battery of similar 
length was administered in the last week of the 
program, a retest interval of about 7 weeks. The 
1965 and 1966 CG Ss were administered an identi- 
cal pre- and postprogram test battery with a com- 
parable retest interval. 

The 29-item I-E scale is keyed so that high 
scores indicate a more external orientation. Six 
buffer items are included, so that the maximum E 
score is 23. Other indexes included in this report 
are the following: 


1. Measures of intelligence: (a) Otis Quick- 
Scoring Mental Ability Test (Otis, 1954), given 
only in 1964; (b) Terman Concept Mastery Test 
(CMT; Terman, 1956), given only in 1965; and 
(c) D48 (Black, 1962), given only in 1966. 

2. Indicators of maladjustment: (a) ISB, given 
in 1964 and 1965 only; (b) Pt scale of the MMPI 
(Hathaway & McKinley, 1951), administered sep- 
arately from the remainder of the MMPI and 
given only in 1964; and (c) d-statistic, the dis- 
crepancy between self- and ideal-self-descriptions 
as computed for each of the 24 ACL scales using 
the formula $d = \sqrt{\sum (X_1 - X_2)^2}$, where $X_1$ and $X_2$ 
are T scores for corresponding scales of the self- 
and ideal-self-descriptions for a given S (computed 
in 1965 and 1966). 

3. Other personality correlates: (a) 24 ACL 
scales and (b) 18 CPI scales. 

4. SC effectiveness measures: At the end of the 
summer, each SC student was assigned an effec- 
tiveness rating based on the combined ratings of 
peers and supervisors, using eight 7-point scales. 
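
The d-statistic in item 2(c) is read here as the square root of the summed squared differences between paired T scores, that is, a Euclidean distance between the self and ideal-self profiles. A minimal Python sketch under that reading follows. The function name and the three-scale profiles are hypothetical illustrations, not data from the study.

```python
import math

def d_statistic(self_scores, ideal_scores):
    """Square root of the summed squared differences between paired T scores."""
    if len(self_scores) != len(ideal_scores):
        raise ValueError("profiles must cover the same scales")
    return math.sqrt(
        sum((x1 - x2) ** 2 for x1, x2 in zip(self_scores, ideal_scores))
    )

# Hypothetical T scores on three scales (the study used 24 ACL scales).
self_profile = [50, 45, 60]
ideal_profile = [53, 49, 60]
d = d_statistic(self_profile, ideal_profile)  # sqrt(9 + 16 + 0)
print(d)
```

A larger d means a wider gap between how a subject describes himself and how he would like to be, which is the sense in which the text treats it as an index of maladjustment.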


2 The writers wish to express their appreciation to 
Richard J. Wiseman, director of the Connecticut 
Service Corps, whose willing cooperation made the 
data collection possible. 


RESULTS 
Normative Data 


Table 1 presents the means and standard 
deviations for all groups. In every case, the 
postprogram mean was lower than the cor- 
responding preprogram mean, indicating a 
retest shift toward internality. No consistent 
sex differences were apparent in means or 
standard deviations. 

The mean test-retest change scores (Table 
1) ranged from —.11 to —1.33. A test for 
the significance of the difference between the 
mean change scores of the 1965 SC subjects 
versus the 1965 CG subjects yielded a nonsignificant 
result (t = .56). A similar test for 
the 1966 groups yielded a t ratio of 1.17, also 
nonsignificant. Apparently the SC experience, 
as opposed to the various summer experiences 
of the CG subjects, did not significantly af- 
fect I-E change scores over a 2-month in- 
terval. In view of this similarity and the 
similarity in means and standard deviations 
for SC and CG subjects, these groups were 
combined in several subsequent analyses. 

Test-retest reliability coefficients (Table 
2) ranged .43-.84 for the 2-month interval 
groups. A group of 18 students who partici- 
pated in both the 1964 and 1965 SC exhib- 
ited a reliability coefficient of .72, based on 


TABLE 1 

MEANS, STANDARD DEVIATIONS, AND CHANGE SCORES 

                   Preprogram             Postprogram
Sample    Sex     N      M      SD       N      M      SD     M change
1964 SC   F      72    7.92    3.84     71    7.79    4.41     −.17
          M      27    8.00    3.97     27    7.41    3.46     −.59
1965 SC   F      68    8.26    3.49     67    7.15    3.77    −1.12
          M      34    8.00    3.08     32    7.00    4.05    −1.03
1965 CG   F      46    9.37    3.76     44    8.05    4.14    −1.32
          M      49    8.67    3.89     46    7.26    4.08    −1.33
1966 SC   F      79    9.54    4.20     77    8.75    4.43     −.79
          M      21    7.38    4.73     20    6.20    3.71    −1.18
1966 CG   F      47    8.79    3.76     47    8.68    3.56     −.11
          M      38    8.84    3.70     35    8.14    4.63     −.70

Note. Abbreviated: SC = Service Corps; CG = control 
group; F = female; M = male. 
TABLE 2 

TEST-RETEST RELIABILITIES 

Sample                    Sex      N      r
1964 SC                   F       71    .43
                          M       27    .45
1965 SC                   F       67    .62
                          M       32    .74
1965 CG                   F       44    .84
                          M       46    .67
Returnees 1964-1965(a)    M + F   18    .72
1966 SC                   F       76    .70
                          M       20    .82
1966 CG                   F       47    .70
                          M       35    .65

Note. All tests were group administered. Abbreviated: 
SC = Service Corps; CG = control group. 
a Represents the correlation between ΣI-E for both years. 


the correlation of their total 1964 I-E score 
(sum of pre- and postprogram scores) with 
their total 1965 I-E score. This figure repre- 
sents test-retest reliability over approximately 
1 year. In summary, the mean scores, change 
scores, and test-retest reliabilities are con- 
sistent with those reported by Rotter (1966). 

In the following analyses, the sum of pre- 
and postprogram I-E scores (ΣI-E) will be 
employed. The reliability of this total score 
for the various groups falls between .60 and 
.91, estimated by the Spearman-Brown form- 
ula. The correlation of total I-E scores with 
three different measures of intelligence (Otis, 
CMT, D48) ranges from −.07 to .17. None 
of these correlations is significant. 
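
The Spearman-Brown estimate for a score formed by summing two administrations follows from the single-administration reliability r as 2r / (1 + r). A brief Python sketch, with a hypothetical function name, applies it to the extreme 2-month test-retest coefficients quoted in the text (.43 and .84):

```python
def spearman_brown(r, k=2):
    """Reliability of a composite of k parallel measurements, each with reliability r."""
    return k * r / (1 + (k - 1) * r)

low = spearman_brown(0.43)   # smallest 2-month coefficient
high = spearman_brown(0.84)  # largest 2-month coefficient
print(round(low, 2), round(high, 2))
```

The two results bracket the range of .60 to .91 stated for the total score.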


Indexes of Maladjustment 


The possible relationship between I-E and 
maladjustment can be examined in terms of 
several indexes. The correlations between 
total I-E and ISB scores were .19 for females 
and .05 for males (pre- and postprogram 
scores combined across years for each sex). 
These correlations were not significantly dif- 
ferent. The r for sexes combined was .14, 
which differs significantly (p < .05) from 
zero. 

The correlations between I-E scores and 
anxiety, as measured by the Pt scale of the 
MMPI, were significant and positive. For the 
1964 SC subjects, the postprogram administration 
of the Pt scale correlated .26 for combined 
sexes; essentially the same relationship 
held for males and females for the preprogram 
administrations of the Pt scale. 

The discrepancy between self- and ideal- 
self-description is another indication of mal- 
adjustment (Rogers, 1951). The correlation 
of the d-statistic, based on preprogram ACLs, 
and I-E was .21 for both the 1965 and the 
1966 samples. No appreciable sex differences 
or differences between pre- and postprogram 
administrations of the ACL were found. 


Personality Correlates 


Table 3 presents the correlations of the 
total I-E score with the 24 ACL self scales 
and 18 CPI scales. In no case did important 
sex differences appear in these correlations. 
Hence, Table 3 contains only correlations for 
the combined group. It was also the case 
that correlations of the I-E variable with pre- 
and postprogram test scores were very similar. 
For the 1965 sample no CPI correlation 
with the I-E varied more than .05 from pre- 
to postprogram. Slightly larger differences 
appeared in the pre-post correlations for the 
ACL. However, in no case were consistently 
significant changes noted from pre- to post- 
program. Consequently, in Table 3 only pre- 
program correlations are listed. Also, since 
the correlations are strikingly similar across 
the three annual samples, the data were 
pooled over years and a single r computed for 
each CPI and ACL self scale. 


TABLE 3 

CORRELATIONS OF ΣI-E WITH CPI AND 
ACL SELF SCALES 

CPI scale(a)                     r        ACL scale(b)             r
Dominance                      −.25**     No. checked            −.06
Capacity for Status            −.22**     Defensiveness          −.21**
Sociability                    −.25**     No. favorable          −.19**
Social Presence                −.20**     No. unfavorable         .18**
Self-Acceptance                −.17**     Self-Confidence        −.18**
Well-Being                     −.32**     Self-Control           −.05
Responsibility                 −.28**     Lability               −.08
Socialization                  −.10*      Adjustment             −.10*
Self-Control                   −.20**     Achievement            −.24**
Tolerance                      −.31**     Dominance              −.25**
Good Impression                −.25**     Endurance              −.23**
Communality                     .01       Order                  [illegible]
Achievement via Conformance    −.29**     Intraception           −.11*
Achievement via Independence   −.13**     Nurturance             −.10*
Intellectual Efficiency        −.28**     Affiliation            −.12*
Psychological-Mindedness       −.18**     Heterosexuality        −.05
Flexibility                     .11*      Exhibition             −.11*
Femininity                      .09       Autonomy               −.09
                                          Aggression             −.00
                                          Change                 −.01
                                          Succorance              .23**
                                          Abasement               .26**
                                          Deference               .11*
                                          Counseling Readiness    .11*

Note. Males and females combined for all 3 years on the 
pretest only. 
a N = 446. 
b N = 448. 
* p < .05, two-tailed test. 
** p < .01, two-tailed test. 
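
The significance markers in Table 3 can be checked by converting each correlation to t = r√(N − 2) / √(1 − r²). The Python sketch below uses a normal approximation to the t distribution, an assumption that is reasonable only because N is near 450. The function names are illustrative and do not come from the article.

```python
import math
from statistics import NormalDist

def r_to_t(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def significance_marker(r, n):
    """Return '**' (p < .01), '*' (p < .05), or '' (two-tailed, normal approximation)."""
    p = 2 * (1 - NormalDist().cdf(abs(r_to_t(r, n))))
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

# Three entries from Table 3 (CPI N = 446, ACL N = 448).
cpi_ach_independence = significance_marker(-0.13, 446)
acl_nurturance = significance_marker(-0.10, 448)
acl_autonomy = significance_marker(-0.09, 448)
print(cpi_ach_independence, acl_nurturance, acl_autonomy)
```

With samples of this size a correlation of about .10 in absolute value already reaches the .05 level, so many of the starred entries reflect small effects.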

As might be predicted, the internal scorer 
seems to be best characterized as high on 
the ACL measures of Defensiveness, Achieve- 
ment, Dominance, Endurance, and Order. On 
the other hand, the internal scorer is lower 
on the ACL scales reflecting Succorance and 
Abasement. On the CPI the internal scorer 
is higher on the Dominance, Tolerance, Good 
Impression, Sociability, Intellectual Effi- 
ciency, Achievement via Conformance, and 
Well-Being scales. The converse of these re- 
lationships may be said to hold for the ex- 
ternal scorer, who also checks fewer favorable 
and more unfavorable self-descriptive adjec- 
tives than does the internal. 

Many of the other scales reveal correla- 
tions in a theoretically consistent direction— 
for example, Personal Adjustment, Deference, 
and Counseling Readiness on the ACL and 
Self-Acceptance on the CPI. 

To further clarify the picture of I-E per- 
sonality distinctions, the 26 most internal 
subjects (with a total I-E score of 7 or less) 
were compared to the 26 most external sub- 
jects (with a total I-E score of 26 or more) 
on the 300 adjectives of the ACL. These cri- 
terion subjects were drawn from the 1964-1965 
subject pool and represent approxi- 
mately 20% of that group. The internals had 
a mean total score of 5.23 compared to the 
group mean for externals of 29.35. Chi-square 
analyses were performed for each of the 
300 adjectives. 
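
Each adjective comparison reduces to a 2 × 2 table: group (internal, external) by response (checked, not checked). A Python sketch of such a test follows. The counts are hypothetical, since the article reports no per-adjective frequencies, and whether a continuity correction was applied to these analyses is not stated, so the correction is left optional here.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]], optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink the cross-product difference by n / 2.
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical adjective: checked by 20 of 26 internals and 10 of 26 externals.
plain = chi_square_2x2(20, 6, 10, 16)
corrected = chi_square_2x2(20, 6, 10, 16, yates=True)
significant = corrected > 3.841  # .05 critical value for 1 degree of freedom
print(round(plain, 2), round(corrected, 2), significant)
```

Running 300 such tests at the .05 level would be expected to produce about 15 significant adjectives by chance alone, which is worth keeping in mind when reading the list of 23.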

Twenty-three adjectives were checked significantly 
more often by the internal individual 
(p < .05) and present a fairly coherent 
description of him, at least as he sees 
himself. The adjectives more frequently 
checked by internals were: clever, efficient, 
egotistical, enthusiastic, independent, self- 


confident, ambitious, assertive, boastful, con- 
ceited, conscientious, deliberate, persevering, 
clear-thinking, dependable, determined, hard- 
headed, industrious, ingenious, insightful, or- 
ganized, reasonable, and stubborn. On the 
other hand, only one adjective was checked 
significantly more often by the externals— 
self-pitying. 


SC Effectiveness 


Finally, an examination of the correlation 
between total I-E scores and SC effectiveness 
revealed consistent results for 1964 and 1965, 
but a nonsignificant relationship for 1966. No 
significant sex differences were found. For 
males and females combined, the 1964 correlation 
was −.20 and the 1965 correlation 
was −.37, both significantly negative. The 
1966 correlation did not differ significantly 
from zero. The significant correlations were 
in the predicted direction and indicate that 
for the 1964 and 1965 groups, individuals 
rated most effective were individuals who 
viewed reinforcements as contingent upon 
behavior—in short, the internally oriented 
individual. The distribution of I-E for the 
1966 group was more variable and somewhat 
more elevated than the previous distribu- 
tions, and it is possible that some distortion 
emerges with this increased range. However, 
there is no satisfactory explanation of the 
inconsistency of the 1966 relationship with 
the pattern of the previous 2 years. 


Discussion 


The results are consistent with what would 
be expected theoretically. The performance 
of the subjects on the I-E scale is consistent 
with their performance on a variety of other 
self-report devices. The practical importance 
of this finding is that, to some degree, in- 
ferences as to internality may be made on 
the basis of inspection of other instruments, 
such as ACL or CPI profiles. 

The data also suggest that the previously 
stated theoretical formulation of I-E may be 
too simplistic. Individuals scoring low on the 
I-E scale (internals) are more homogeneous 
on their test performances than are high 
scoring subjects. This may suggest a diversity 
in the psychological meaning of externality. 
For example, one may be an external individual 
because he is in fact physically or 
intellectually weak in relation to those around 
him. On the other hand, a person may de- 
scribe himself as an external because he is in 
a highly competitive social situation, where 
the actions of others may have great rele- 
vance for the success of his own efforts. Both 
of these orientations may be described as 
simultaneously realistic and pessimistic, yet 
there are other possible conditions that could 
be antecedents to an external orientation. If 
a person believes in luck or fate, and if he 
further believes that these external forces are 
on his side, he may accurately describe him- 
self as an external. Further, a person may 
develop feelings of persecution, with or with- 
out reason. Both of these orientations would 
be described as relatively unrealistic, while 
the former would be optimistic and the latter 
pessimistic. These various possibilities are 
consistent with the findings of diffuseness in 
the self-descriptions of externals. 

Internal individuals are likely to describe 
themselves as active, striving, achieving, pow- 
erful, independent, and effective. Externals 
might possibly also describe themselves in 
this way, but they will more likely describe 
themselves in somewhat opposite terms. Cer- 
tainly the data in this report support the 
conclusion that internality is consistently as- 
sociated with indexes of social adjustment 
and personal achievement. It is suggested, 
however, that theoretical and empirical dif- 
ferentiation of the notion of externality would 
more sharply define this relationship. 

It is also possible that the utility of I-E 
for behavioral prediction would be increased 
if externality were to be differentiated along 
the lines indicated. At the very least, an at- 
tempt should be made to assess the extent to 
which a self-description of externality has 
prima facie veridicality and to assess the 
extent to which a person considers external 
forces to be benevolent as opposed to malevolent. 
Were this to be done, the complexity of 
the relation of maladjustment to I-E (Rotter, 
1966) might be resolved. The present data 
make it clear that this relationship is worth 
exploring. 


INFLUENCE OF THERAPEUTIC TECHNIQUES ON COLLEGE 
STUDENTS’ PERCEPTIONS OF THERAPISTS¹ 


GLENN E. SNELBECKER 


Veterans Administration Hospital, Brockton, Massachusetts 


Using an experimental analogue, this study compared college students’ Relation- 
ship Inventory (RI) perceptions of therapists as a function of directive/ 
nondirective therapist behavior. Results supported the hypothesis that non- 
directive therapist behavior elicits more favorable perceptions by college stu- 
dents in terms of empathic understanding, level of regard, and uncondition- 
ality of regard. No differences were obtained on congruency, the dimension 
which some client-centered therapists consider to be the most important. The 
2nd therapist observed was rated significantly higher than the 1st, indicating 
that order effects should be taken into account in evaluating patients’ percep- 
tions of therapists when repeated measures are used. A sex effect was obtained 
on 3 RI dimensions, and there was the suggestion that females may be more 


susceptible to order effect than males. 


Almost a decade ago Rogers (1957) identi- 
fied six conditions which he held to be neces- 
sary and sufficient for achieving constructive 
personality change in psychotherapy. Among 
these six were three characteristics of the 
therapist: (a) that he be congruent, or inte- 
grated, in the relationship, (b) that he ex- 
perience unconditional positive regard for the 
client, and (c) that he have empathic under- 
standing of the client’s internal frame of 
reference and endeavor to communicate this 
understanding to the client. Rogers’ sixth con- 
dition emphasized the importance of com- 
municating such therapist attitudes to the 
patient. Seeman (1965) commented on the 
contemporary relevance of these three basic 
characteristics of therapists and designated 
congruence as the most important basic pre- 
condition of effective therapy. Some empirical 
support has been obtained as to the importance 
of these therapist attributes by Barrett-Lennard 
(1962), Sundland (1960), and 
van der Veen (1965), in that patients’ prog- 
ress in therapy has been found to be related 
to their perceptions of their therapists along 
one or more of these dimensions. An instru- 
ment called the Relationship Inventory (RI; 
Barrett-Lennard, 1962) has been used in 


1 These data were collected as part of the author’s 
dissertation at Cornell University. The author grate- 
fully acknowledges the assistance of Lars P. Peter- 
son, A. Gordon Nelson, Henry N. Ricciuti, Mur- 
ray A. Straus, Eugene L. Schwaab, Jr., Frederick 
Heilizer, and Henry S. G. Cutter. 
these studies to assess patients’ perceptions 
of their therapists along the dimensions de- 
lineated by Rogers (1957), with the addi- 
tion that Barrett-Lennard divided Rogers’ 
second therapist characteristic into two con- 
cepts—level of regard and unconditionality of 
regard. Thus the RI assesses perceptions along 
four dimensions: empathic understanding, 
level of regard, unconditionality of regard, 
and congruency. 

Having found these perceptions of ther- 
apists’ attitudes to be related to progress in 
therapy, one next step is to delineate the be- 
haviors of the therapist which seem effective 
in communicating such attitudes to patients. 
Using an experimental analogue, the purpose 
of the present study was to investigate 
whether directive or nondirective therapist 
behavior tends to elicit more favorable per- 
ceptions of therapists along these dimensions. 
The directive-nondirective continuum was 
selected because of its earlier identification 
with the client-centered approach (Rogers, 
1942) and the controversy which resulted 
when other therapists indicated that they, too, 
use nondirective techniques under certain con- 
ditions (e.g., Seeman, 1965). Studies focusing 
on psychotherapy patients’ and vocational 
counseling clients’ preferences for therapists 
along this continuum have produced generally 
contradictory results (Ashby, Ford, Guerney, 
& Guerney, 1957; Barahal, Brammar, & Sho- 
strom, 1950; Forgy & Black, 1954; Grigg & 
Goodstein, 1957; Sonne & Goldman, 1957). 


METHOD 


The general strategy in this study was to com- 
pare Ss’ RI perceptions of a directive therapist and 
a nondirective therapist after they had seen sound- 
film recordings of each therapist in an actual psycho- 
therapy session. All Ss were shown both therapists 
so that they could serve as own-controls in terms of 
absolute levels of the ratings. A counterbalanced 
design was used to control for the possible in- 
fluence of Ss’ perceptions of one therapist on their 
perceptions of the other therapist; half of the Ss 
(84 males, 35 females) were shown the nondirective 
therapist first, while the other half saw the direc- 
tive therapist first. All Ss used the RI to rate 
the therapist immediately after viewing each film. 


Subjects 


The Ss were 238 students in an introductory general 
psychology course (168 males and 70 females); 
they participated in several research projects as 
an integral part of their course. Male Ss ranged 
in age 17-35, with a median age of 18 years, 7 
months; female Ss ranged in age 16-22, with a 
median age of 17 years, 10.5 months. Most Ss (121 
males, 58 females) were college freshmen or sopho- 
mores, while the others were more advanced under- 
graduates. 


Therapist Behavior 


The type of therapist behavior was manipulated 
by defining two families of therapist verbal re- 
sponses, directive and nondirective, and presenting 
to Ss a sound-film recording of a therapist typical 
of each type. The two films² described in this study 
were used because descriptions and reviews indi- 
cated that they depicted substantial differences in 
therapist behavior along the directive-nondirective 
continuum. 

Nondirective therapist behavior (NTB). This 
family of therapist verbal responses included: re- 
flection of feeling, restatement of content, nondirec- 
tive leads, nondirective structuring responses, and 
frequent use of silence when the patient offers no 
verbal response. The therapist lets the patient do 
most of the talking and suggesting of topics. The 
behavior of the therapist is such that it is the 
responsibility of the patient to initiate and to direct 
the course of the verbal interaction during the 
psychotherapy session. 

Though therapists of various persuasions make 
use of such procedures or techniques under some 
conditions or with certain patients, client-centered 
therapists have been noted in particular for the 
extent to which such procedures constitute a basic 
part of their theory and method of therapy (See- 
man, 1965). Since Rogers frequently has been char- 
acterized as a major spokesman for this latter 
group, the film selected to present NTB was one
in which Rogers meets with one of his patients in
an initial interview: Psychotherapy Begins: The Case
of Mr. Lin (Psychological Cinema Register, 1965).


² The first 20 minutes of each film was used,
omitting introductory comments about the film;
superimposed titles were blocked so that they were
not seen on the screen.

Directive therapist behavior (DTB). The second 
family of therapist verbal responses included: direc- 
tive leads, interpretations, directive structuring, 
questioning (i.e., probing questions), approval, en-
couragement, suggestions, and advice. The therapist
generally attempts to initiate and to direct such 
verbal interaction during the therapy session as will 
assist the patient in the resolution of his problems. 

A film was used in which the therapist’s behavior 
was generally typical of these verbal responses: 
Psychotherapeutic Interviewing Series: Part II. A 
Method of Procedure (Veterans Administration, 
1956). One reviewer (Association of American Medi- 
cal Colleges, 1953) depicted the therapist in this 
film as being so directive that student psychother- 
apists viewing the film may mistakenly conclude 
that the therapist always knows, rather than merely 
tries to know, from the outset which of the patient’s 
comments are most important. 


Perceptions of Therapists 


The Relationship Inventory, Form OS-M (Barrett- 
Lennard, 1962), assesses patients’ perceptions of 
therapists along four dimensions: (a) empathic 
understanding—the extent to which the therapist 
communicates immediate awareness of and under- 
standing for his patient; (b) level of regard—the 
composite positive and negative feelings displayed 
by the therapist for his patient; (c) unconditionality 
of regard—the constancy of affective response, or 
its lack, regardless of the general level of regard 
expressed by the therapist for his patient; and 
(d) congruency—the extent to which there is unity 
in the therapist’s covert and overt feelings for his 
patient. 

The RI consists of 71 statements, 17 for empathic 
understanding and 18 each for level of regard, 
unconditionality of regard, and congruency. Each
statement is scored according to a 6-point scale, 
in which +3 means strong agreement with the 
statement and —3 means equally strong disagree- 
ment. Since questions which positively evaluate the 
relationship are not always stated positively, and 
vice versa, a conversion is made during scoring so 
that a continuum from +213 to —213 is achieved. 
According to this system +213 means that the patient 
evaluates his therapist as highly as possible along 
all dimensions, while —213 would, of course, mean 
just the opposite. The score for each dimension is 
obtained by computing the algebraic sum for the 
respondent’s answers to the items for the respec- 
tive dimensions. Thus, scores for each dimension 
can range from +54 to —54 (+51 to —51, for 
empathic understanding). 
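
The scoring rule just described can be sketched in a few lines of Python. This is an illustration only: the source does not list which items are negatively worded, so the item positions passed as `negatively_worded` are hypothetical, and the six response values (+3 to -3 with no zero point) are an assumption inferred from the "6-point scale" wording. The item counts per dimension and the sign conversion come from the text.

```python
# Sketch of Relationship Inventory (RI) scoring as described in the text.
# ASSUMPTIONS: the six response values (no zero point) and the positions of
# negatively worded items are illustrative; the source does not list them.

DIMENSION_SIZES = {
    "empathic_understanding": 17,
    "level_of_regard": 18,
    "unconditionality_of_regard": 18,
    "congruency": 18,
}
VALID_RESPONSES = {-3, -2, -1, 1, 2, 3}


def score_dimension(responses, negatively_worded=frozenset()):
    """Algebraic sum of responses, with negatively worded items sign-reversed."""
    total = 0
    for position, answer in enumerate(responses):
        if answer not in VALID_RESPONSES:
            raise ValueError(f"response {answer!r} is not on the 6-point scale")
        total += -answer if position in negatively_worded else answer
    return total


def score_inventory(answers, negative_items=None):
    """Score all four dimensions and the overall total for one respondent."""
    negative_items = negative_items or {}
    scores = {}
    for name, size in DIMENSION_SIZES.items():
        if len(answers[name]) != size:
            raise ValueError(f"{name} needs {size} responses")
        scores[name] = score_dimension(
            answers[name], negative_items.get(name, frozenset())
        )
    scores["total"] = sum(scores.values())
    return scores


# A respondent who answers +3 to every item, none of them negatively worded.
most_favorable = {name: [3] * size for name, size in DIMENSION_SIZES.items()}
print(score_inventory(most_favorable))
```

For this most favorable respondent the dimension scores are 51, 54, 54, and 54, and the total is 213, which matches the ranges stated above.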

Barrett-Lennard’s (1962) form (Form OS-M) 
contains statements such as “He respects me,” “He 
tries to understand just how I see things.” For the 
present project, it was necessary to revise these 
statements for nonparticipant observers (Form
OS-M-C), so that they read “The therapist respects
his patient,” “The therapist tries to understand just
how his patient sees things.” Similar changes were
made in the other RI statements.


TABLE 1

ADJUSTED SPLIT-HALF RELIABILITY COEFFICIENTS FOR
TWO FORMS OF THE RELATIONSHIP INVENTORY

                              Form OS-M-C (a)
Person-perception         Nondirective   Directive     Form
dimension                 therapist      therapist     OS-M (b)

Regard                        .87           .94          .93
Empathic understanding        .84           .76          .78
Unconditionality              .82           .85          .83
Congruence                    .84           .75          .89

(a) N = 66.
(b) N = 42.

To compare the OS-M form and the OS-M-C 
form, data from 66 randomly selected Ss from the 
present sample were analyzed; split-half reliability 
coefficients were computed for the latter form and 
were compared with similar coefficients reported by 
Barrett-Lennard (1962). As can be seen in Table 1, 
Form OS-M-C appears to be quite similar to Form 
OS-M in terms of this internal consistency measure 
of reliability.
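
As a rough illustration of how an adjusted split-half coefficient can be obtained, the sketch below correlates odd-item and even-item half scores and steps the result up with the Spearman-Brown formula. Both choices are assumptions: the source says only that the coefficients were "adjusted" and does not describe how the halves were formed or which correction was applied.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    ss_x = sum(d * d for d in dev_x)
    ss_y = sum(d * d for d in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a half has no variance")
    return sum(a * b for a, b in zip(dev_x, dev_y)) / sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def adjusted_split_half(item_responses):
    """item_responses: one list of item scores per respondent (one RI dimension).

    ASSUMPTION: odd/even item split; the source does not describe the split.
    """
    first_half = [sum(row[0::2]) for row in item_responses]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in item_responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(first_half, second_half))
```

Under this correction a hypothetical half-test correlation of .77 would be reported as about .87, since 2(.77)/(1 + .77) is approximately .87.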


RESULTS 


Table 2 summarizes data for each of the 
four RI dimensions (empathic understanding, 
level of regard, unconditionality of regard, 
and congruency) as a function of therapeutic 
technique, the order of presentation of ther- 
apeutic techniques (seen before or seen after 
the other therapist), and sex of the observer- 
subject. A 2 × 2 × 2 analysis of variance with


one repeated measure (Winer, 1962) was con- 
ducted for each of these dimensions, and the 
results are summarized in Table 3. 

The main question investigated in this 
study was whether therapeutic technique in- 
fluences students’ perceptions of therapists. 
These analyses of variance indicate that the 
nondirective therapist was given more favor- 
able ratings on empathic understanding, level 
of regard, and unconditionality of regard 
(p < .01). When one compares the data for
the RI dimensions, it can be seen that it 
was the unusually low mean square for thera- 
peutic techniques which led to nonsignificant 
results on the congruency dimension, rather 
than an unusually high error term. 

A significant order effect was obtained on 
all four RI dimensions (p < .01). By inspec- 
tion of the means (Table 2), it is evident 
that the therapist presented second was 
rated higher than the one presented first. 
Since the interaction between therapeutic 
technique and presentation order was not 
statistically significant, there is evidence that 
this order effect occurred essentially equally 
with the two therapeutic techniques studied. 

The therapists were rated more favorably 
by female subjects than by male subjects on 
three of the RI dimensions—empathic under- 
standing, unconditionality of regard, and 
congruency (p < .01). There was a significant
interaction between sex of observer and pre-
sentation order on congruency (p < .05). By
inspection of the data (Table 2) it can be 
seen that both males and females gave higher 
ratings for the therapist seen second; however,
on the congruency dimension, females showed a
greater increase than did males.


TABLE 2

MEANS FOR RELATIONSHIP INVENTORY DIMENSIONS AS A FUNCTION OF THERAPEUTIC TECHNIQUE,
PRESENTATION ORDER, AND SEX OF OBSERVER

                  Empathic                            Unconditionality
                  understanding    Level of regard    of regard           Congruency
Technique         Male   Female    Male   Female      Male   Female       Male   Female

Nondirective
  Seen first      10.8    13.4     17.1    15.7       10.2    16.5         8.9     6.1
  Seen second     17.2    18.8     25.7    25.7       16.3    20.2        13.5    14.1
Directive
  Seen first       5.4     8.5     10.8    11.6        5.0    10.7         5.3     5.7
  Seen second      6.7    15.4     14.9    20.7        7.9    13.8        11.1    18.2


TABLE 3

ANALYSIS OF VARIANCE FOR RELATIONSHIP INVENTORY DIMENSIONS

                                     Empathic                                  Unconditionality
                                     understanding       Level of regard       of regard             Congruency
Source of variation            df    MS         F         MS         F         MS         F          MS         F

Between Ss                    237
  Sex of Ss (S)                 1     392.71    8.16**      41.67    -          738.31   13.44**       42.75     4.70*
  Order of presentation (O)     1     614.88   12.79**   1,590.13   22.79**     510.61    9.30**    1,334.47   146.50**
  S × O                         1      32.75    -           61.38    -            6.85    -           154.56    16.98**
  Error: Ss within groups     234      48.09    -           69.78    -           54.93    -             9.10     -
Within Ss                     238
  Therapeutic technique (T)     1   1,388.57   22.20**   1,604.79   16.62**   1,294.60   26.33**       70.36     -
  S × T                         1      91.57    -           98.95    -            3.24    -           145.28     -
  O × T                         1      75.06    -           85.72    -           42.95    -            37.16     -
  S × O × T                     1      67.34    -           19.41    -           11.05    -            15.30     -
  T × Ss within groups        234      62.55    -           96.53    -           49.17    -            99.35     -

*p < .05
**p < .01


DISCUSSION 


These data are compatible with the hypo- 
thesis that nondirective therapist behavior in- 
fluences college students toward perceiving a 
therapist as being more understanding, having 
a higher level of regard for his patient, and 
being less conditional in his regard for his pa- 
tient than does directive therapist behavior. 
These results are consistent with the preference 
for nondirective therapists reported by Barahal 
et al. (1950), but tend to contradict results
from those studies in which subjects indicated 
that they prefer directive therapist behavior 
(Forgy & Black, 1954; Grigg & Goodstein, 
1957; Sonne & Goldman, 1957). It is of 
interest that nondirective therapist behavior 
seems to elicit favorable perceptions along 
three of these dimensions, since they have 
been found to be related to patients’ progress 
in therapy. Somewhat surprising is the find- 
ing that differences in behavior along the 
directive-nondirective continuum do not seem 
to influence perceptions of congruency, the 
dimension which some client-centered ther- 
apists consider to be the most important of 
the four. If this finding is confirmed in future 
research, it conceivably could indicate need 
for reevaluation of the importance of this di-
mension or need for identifying therapeutic
techniques which facilitate communication of 
congruency. 

Quite unexpected was the finding that the 
second therapist observed was rated signif- 
icantly higher than the first. In a previous 
study involving presentation of sound record- 
ings to high school students (Sonne & Gold- 
man, 1957), no statistically significant order 
effect was found. In contrast, the present data 
suggest that observing and rating a ther- 
apist may influence the reported perceptions 
of a second therapist and, perhaps, may in- 
fluence subsequent perceptions of the same 
therapist. There is also partial support for 
the notion that females may be more suscep- 
tible to such an effect than males. Thus, order 
effects should be taken into consideration in 
evaluating patients’ perceptions of therapists 
where repeated measures are used. Since many 
studies of patients and therapists involve sev- 
eral ratings over time, it would seem impor- 
tant to determine whether findings which are 
obtained may represent an order effect which 
is independent of the variables being studied. 

There was considerable divergence of per- 
ceptions among subjects along each of the 
four RI dimensions. On three of these, female 
subjects reported more favorable perceptions 
of the therapists than did male subjects. This 
would raise questions concerning other indi-
vidual differences which might influence col-
lege students’ perceptions of therapists or
the bases on which they report such percep-
tions. For example, there is at least the pos-
sibility that there may be systematic rela-
tionships between observers’ personality
characteristics and their perceptions of
therapists.


That these data were obtained in an ex- 
perimental analogue sets obvious limitations 
as to their implications for actual psycho- 
therapy sessions. However, at the least they 
may suggest questions for further study with 
actual patients and their therapists. 
CROSS-VALIDATION OF THE HALSTEAD-REITAN TESTS
FOR BRAIN DAMAGE¹


ARTHUR VEGA, JR., AND OSCAR A. PARSONS
Medical Center, University of Oklahoma 


The Halstead-Reitan test battery offers considerable promise as a standardized 
research and clinical technique for studying behavioral deficits associated with 
brain damage. With the goals of cross-validation and of providing usable 
normative data from a population in a different geographic setting, the tests 
were administered to 50 brain-damaged and 50 control Ss. The brain-damaged
patients performed more poorly than the controls (p < .001) on all but 1 test. 
The general level of performance was lower than that previously reported. 
Overall performance of the control Ss was significantly related to age and 
educational level, with correlations of —.57 and .56, respectively. Use of 
Halstead’s Impairment Index resulted in a 73% correct classification of both 
groups using a cutoff score of .7. Scores were transformed to allow intra- and 
interindividual comparisons of level of performance. A modified index resulted
in 79% correct classification.


The psychological evaluation of patients 
with suspected or known brain damage is a 
pressing clinical problem. It is not infrequent 
in such patients that deficits in psychological 
functions of a fairly subtle nature are the 
first to appear but may be missed in the 
clinical neurological examination. In cases 
where the diagnosis is well established, the 
delineation of areas of defective and intact 
abilities can contribute to rehabilitative 
efforts and to our as yet inadequate knowl- 
edge of brain-behavior relationships. Since 
the pioneering work of Goldstein (1942) and 
Goldstein and Scheerer (1941), countless 
studies have attempted to describe and 
quantify impairment of functions in brain- 
damaged individuals. The findings have 
consistently indicated that brain-damaged 
individuals perform more poorly than non- 
brain-damaged individuals on a wide variety 
of tasks (Reitan, 1962). However, the fact 
remains that there are few discriminating 
psychological tests which have withstood the 
test of cross-validation over different geo-
graphical regions, socioeconomic classes, and
pathological groups.


¹ This research was supported by Public Health
Service Research Grant No. NB-05359 from the
National Institute of Neurological Diseases and
Blindness. The authors wish to express their grati-
tude to Julian Burn, Marilyn Gasswint, and Sarah
Yourshaw for their assistance in all phases of the
study. J. Paul Costiloe of the University of Okla-
homa Medical Center Computer Facility prepared
the major statistical calculations.

There are a number of possible reasons for 
the generally disappointing findings in this 
field. The majority of psychological tests for 
brain damage today suffer from inadequate 
standardization, either because the test data 
are not adequately quantifiable or because of 
a lack of application to populations suffi- 
ciently varied in diagnostic, cultural, and 
socioeconomic classifications. It is no accident 
that the best of our testing instruments, such 
as the Stanford-Binet and Wechsler Adult
Intelligence Scale, are those which have been 
standardized on representative samples from 
the general population of the United States. 

The requirements for the most useful psy- 
chological technique in this area of investiga- 
tion are well recognized: (a) the technique 
should provide a maximum degree of dis- 
crimination between patients with brain 
dysfunction and those with normally func- 
tioning central nervous systems; (b) the 
technique should assess a range of abilities 
and thus include a number of tests; (c) the 
performance of the subjects should be capa- 
ble of quantification; and (d) the effects of 
factors such as sex, race, and, particularly, 
age and education on the test scores should 
be known and appropriately weighted where 
indicated. 

There is only one test battery which even 
partially meets these requirements. For over
20 years, Halstead and his colleagues have 
conducted a long-term investigation of a 
battery of tests specifically developed to 
measure the effects of brain pathology upon 
human adaptive functioning (Halstead, 1947; 
Shure & Halstead, 1958). Subsequently, well 
over 2,000 patients with documented brain 
dysfunction have been tested on this battery 
of tests in the laboratories of one of Hal- 
stead’s colleagues (Reitan, 1955b, 1962), 
producing evidence for a satisfactory cross- 
validation. Further evidence is available not 
only that prediction of brain damage can be 
made with a considerable degree of accuracy, 
but also that patterns of decrement in the 
abilities measured by these tests often are 
characteristic of the particular nature and 
location of the brain dysfunction or lesion 
(Reitan, 1955a; Reitan & Tarshes, 1959). 
This is a remarkable accrual of data, and, if 
substantiated in other laboratories, it may 
well be the solid core of reference data so 
urgently needed in both clinical and research 
settings. However, there are several questions 
regarding the methods by which the data are 
customarily analyzed and presented which 
must be settled before the norms would be 
of maximal use to investigators using these 
tests in other laboratories. 

The first and most crucial question is con- 
cerned with the manner of specification of 
the level of performance on the tests and the 
extent to which these scores can be expected 
to shift as a function of differences in the 
populations under investigation. This two- 
fold question cannot be adequately answered 
by reference to the current literature on the 
Halstead tests. Due, perhaps, to the research 
goals of the majority of these studies, the 
test scores of the subjects have generally been 
presented either in terms of tables of signifi-
cant t ratios (Reed, Reitan, & Kløve, 1965)
or as T-score conversions (Matthews, Shaw,
& Kløve, 1966; Reitan, 1955b). The present
investigators reviewed over 20 publications
concerned with various combinations of tests
in the Halstead-Reitan battery and found
only 7 which included data as to mean scores.
To find mean test scores of non-brain-dam-
aged subjects is particularly difficult.

In view of this lack of published standardi- 


zation data, it is not surprising that data re- 
garding the effects of age and education on 
level of performance on these tests are also 
not available. Clinical experience suggests 
that these variables play an important role 
in the interpretation of any individual score. 
Although published studies involving the tests 
in this battery have carefully matched or 
equated groups on the age and education 
variables (Matthews et al., 1966; Reitan, 
1955b), the effects of these variables have 
not been specified in a manner that would 
facilitate evaluation of the scores of any indi- 
vidual patient. Experience in our laboratories 
in Oklahoma in attempts at replication of 
published test results has underlined the pos- 
sible effects of differences in the cultural com- 
position of the populations as a factor to be 
considered in interpreting psychological test 
scores (Parsons, Maslow, & Stewart, 1963). 

The aim of the present study is to provide 
further evidence for the validity of the Hal- 
stead-Reitan battery, with particular empha- 
sis on the presentation and analysis of the 
test scores in terms of level of performance. 
It is felt that such a specification of perform- 
ance level and the variations attributable to 
age and educational differences is a starting 
point which cannot be bypassed if different 
investigators intend to use a common set of 
techniques to study brain-behavior relation- 
ships. 


METHOD 
Test Materials 


The tests under investigation, referred to as the 
Halstead-Reitan battery, include the following: 
Category test; Tactual Performance test, time, mem- 
ory, and location components; Rhythm subtest, 
Seashore Tests of Musical Talent; Speech-Sounds 
Perception test; Finger Oscillation test; Time Sense 
test, memory component. Detailed descriptions of 
these tests are to be found in Halstead’s (1947) 
book, as well as in several other publications (Rei- 
tan, 1955b; Shure & Halstead, 1958). In addition, 
the full Wechsler Adult Intelligence Scale (WAIS) 
was administered. The Halstead Flicker-Fusion test, 
part of the original battery, was not investigated in 
its original form, since subsequent work (Reitan, 
1955b) has suggested that its validity is question- 
able. A more reliable measure of visual deficit based 
on flicker perception has been devised in our labora- 
tories and is currently in use (Parsons, Chandler, 
Teed, & Haase, 1966; Parsons & Huse, 1958; Vega, 
Parsons, & Chandler, 1966). The tests were ad- 
ministered by trained assistants supervised by the 
senior author, who had received training in Reitan’s
laboratory regarding standard administration and
scoring of the test battery.


TABLE 1

DIAGNOSTIC CLASSIFICATIONS OF BRAIN-DAMAGED AND
CONTROL SUBJECTS

Brain-damaged                        n    Control                                          n

Cerebral vascular disease           14    Peripheral neurological disorder & orthopedic   28
Trauma                              14    Nonhospitalized                                  7
Tumor                               11    Gastrointestinal                                 5
Degenerative                         7    Psychoneurotic                                   5
Infectious                           2    Cardiovascular                                   2
Bilateral prefrontal lobotomy        1    Respiratory                                      1
Focal seizures, etiology unknown     1    Diabetes                                         1
                                          Thyroidism                                       1


Subjects 


The Ss for the present study consisted primarily 
of patients from the University of Oklahoma Hos- 
pital and the Oklahoma City Veterans Administra- 
tion Hospital. The brain-damaged group consisted 
of 50 patients, all of whom had undergone a com- 
plete neurological examination. They were included 
in the sample only if they presented unequivocal 
evidence of brain damage, such as pathological find- 
ings on established diagnostic techniques. Further- 
more, only Ss who were functioning well enough to 
be administered the complete neuropsychological 
test battery were used. The control group consisted
of 50 Ss, 43 of whom were patients hospitalized for 
various causes other than CNS dysfunction, and 7 
nonhospitalized subjects. All of these patients had 
undergone a complete physical examination, al- 
though not necessarily a neurological examination 
more detailed than that customarily given in a 
thorough physical examination. Any control S with 
questionable evidence of brain pathology was not 
included in the sample. A neurologist² completed
detailed rating sheets for all Ss, in which such fac- 
tors as location and type of lesion were assessed. 
The neurologist’s mean rating of severity of damage 
was 2.8 on a 5-point scale, indicating a “moderate” 
degree of severity. The mean age of the brain- 
damaged group was 41.7 years (SD = 14.8), while 
that for control Ss was 40.8 (SD = 13.1). Mean
education was 10.2 years (SD = 3.1) for the brain-
damaged group and 11.1 (SD = 3.2) for the con-
trols. The means of the two groups did not differ
significantly on the age and education variables. 
There were 44 males and 6 females in the brain- 
damaged group, while in the control group the dis- 
tribution was 37 and 13. Distribution of race for 
the brain-damaged group was 45 white, 4 Negro, and 


² Fay K. Myers and Gunter Haase provided their
valuable services in this phase of the study. 
1 Indian, and the comparable distribution in the
control group was 43, 4, and 3. In Table 1 are the 
diagnostic classifications of the two groups. Con- 
cerning the general location of pathology in the 
brain-damaged group, 15 patients were classified as 
having right hemisphere damage only and 15 as 
having left hemisphere pathology only. The re-
maining 20 patients were rated as exhibiting diffuse,
bilateral, and/or subcortical brain damage. 


RESULTS 


Level of performance. The mean raw scores 
obtained by the two groups of subjects on the 
eight major variables are presented in Table 
2 along with the associated t ratios. In addi-
tion, the mean values for IQ and for Hal-
stead’s Impairment Index are included. It
can readily be seen that the brain-damaged 
group performed at a poorer level than the 
control group on all tests. The difference be- 
tween mean scores is highly significant (p < 
.001) for all but the Time Sense test. 
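
The tabled t ratios can be approximated from the published means and standard deviations alone. The sketch below assumes a pooled-variance (Student) t for two independent groups of 50; the source does not state which form of the t test was used, so this is a plausibility check and not a reconstruction of the authors' computation.

```python
from math import sqrt


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance t ratio for two independent groups (assumed form)."""
    pooled_variance = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error


# Category test: brain-damaged M = 81.3, SD = 31.7; control M = 59.4, SD = 26.9.
category_t = t_from_summary(81.3, 31.7, 50, 59.4, 26.9, 50)

# Time Sense (memory): brain-damaged M = 476.2, SD = 527.2; control M = 320.4, SD = 302.6.
time_sense_t = t_from_summary(476.2, 527.2, 50, 320.4, 302.6, 50)

print(round(category_t, 2), round(time_sense_t, 2))
```

Both values fall within rounding distance of the tabled ratios (3.73 and 1.81), which is what one would expect if the published means and SDs were themselves rounded.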

TABLE 2

MEAN SCORES, STANDARD DEVIATIONS, AND t VALUES
FOR BRAIN-DAMAGED AND CONTROL PATIENTS
ON HALSTEAD TEST BATTERY AND
WECHSLER ADULT INTELLIGENCE SCALE

                            Brain-damaged              Control
Test                        M            SD            M            SD            t

Halstead
  Category                   81.3         31.7          59.4         26.9         3.73***
  Tactual performance
    Time                     32.4         10.5          20.6         10.4         5.62***
    Memory                    4.5          2.4           6.6          1.9         4.88***
    Location                  1.4     [illegible]        2.9          2.2     [illegible]
  Rhythm                     20.4     [illegible]   [illegible]  [illegible]  [illegible]
  Speech                     16.9         10.4           9.5          6.6     [illegible]
  Tapping, dominant          35.5         11.6      [illegible]  [illegible]  [illegible]
  Time sense, memory        476.2        527.2         320.4        302.6         1.81
  Impairment index            .83          .20           .57          .28         5.40***
WAIS
  Verbal IQ                  87.2         17.1          99.8         14.4         3.81***
  Performance IQ             83.6         13.6          98.8         11.8         5.93***
  Full scale IQ              84.9         14.8          99.4         12.9         5.23***

Note. N = 100; 50 brain-damaged subjects, 50 control subjects.
***p < .001, two-tailed test.


Effectiveness of Halstead’s criteria for “impaired”
performance. Although the brain-damaged group
received a significantly higher Impairment Index
than the control subjects, use of a cutting score
of .6 and greater to classify a subject as
brain-damaged resulted in an overly large number
of controls (54%) being misclassified. By use of
a critical score of .7 and greater, a total correct
classification of brain-damaged and control
patients of 73% is obtained. Use of this cutting
point, however, continues to produce a rather
large proportion of “false positive” errors (34%).
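
The cutting-score comparison can be expressed as a small routine that counts hits and false positives at a given cutoff. The index values below are hypothetical and serve only to show the mechanics; the function and variable names are likewise illustrative.

```python
def is_classified_impaired(impairment_index, cutoff):
    """Halstead-style rule: an index at or above the cutoff counts as impaired."""
    return impairment_index >= cutoff


def classification_summary(brain_damaged, controls, cutoff):
    """Proportions of correct and incorrect classifications at one cutoff."""
    hits = sum(is_classified_impaired(x, cutoff) for x in brain_damaged)
    false_positives = sum(is_classified_impaired(x, cutoff) for x in controls)
    correct = hits + (len(controls) - false_positives)
    return {
        "brain_damaged_correct": hits / len(brain_damaged),
        "false_positive_rate": false_positives / len(controls),
        "overall_correct": correct / (len(brain_damaged) + len(controls)),
    }


# HYPOTHETICAL indices for four subjects per group (not data from the study).
example_brain_damaged = [0.9, 0.7, 0.6, 1.0]
example_controls = [0.3, 0.7, 0.5, 0.6]
for cutoff in (0.6, 0.7):
    print(cutoff, classification_summary(example_brain_damaged, example_controls, cutoff))
```

Raising the cutoff trades hits for fewer false positives, which is the pattern reported above. With 50 subjects per group, the reported 34% false-positive rate corresponds to 17 misclassified controls.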

Conversion of raw scores to T scores. In 
order to enable intra- and interindividual 
comparisons of level of performance and to 
evaluate better the effects of age and educa- 
tion differences, the raw data were trans- 
formed in the following manner. The raw 
scores of the control subjects for each varia- 
ble were rank-ordered and converted to nor- 
malized T distributions with a mean of 50 
and a standard deviation of 10. Using the 
conversion tables thus obtained as representa- 
tive of “normal” (non-brain-damaged) per- 
formance, the scores of the brain-damaged 
group were converted to T scores. The T 
scores were constructed so that higher T 
scores represented more adequate performance 
on all measures. 
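
One way to carry out the conversion described above is sketched below. Several details are assumptions, since the source gives only the outline: percentile ranks are taken at the midpoint of tied scores, brain-damaged raw scores are looked up at the nearest tabled control score, and a `higher_is_better` flag reverses tests on which a high raw score (errors, time) means poor performance.

```python
from bisect import bisect_left, bisect_right
from statistics import NormalDist


def build_t_table(control_scores, higher_is_better=True):
    """Return sorted (raw score, normalized T) pairs based on the control sample."""
    ordered = sorted(control_scores)
    n = len(ordered)
    table = []
    for raw in sorted(set(ordered)):
        below = bisect_left(ordered, raw)
        ties = bisect_right(ordered, raw) - below
        proportion = (below + 0.5 * ties) / n  # midpoint percentile rank (assumed)
        if not higher_is_better:
            proportion = 1 - proportion  # low raw score = good performance
        z = NormalDist().inv_cdf(proportion)
        table.append((raw, 50 + 10 * z))  # mean 50, SD 10
    return table


def raw_to_t(raw, table):
    """Look up T for any raw score using the nearest tabled control score."""
    nearest = min(table, key=lambda pair: abs(pair[0] - raw))
    return nearest[1]


# HYPOTHETICAL control scores on an error-count test (lower is better).
control_errors = [20, 35, 50, 65, 80]
conversion = build_t_table(control_errors, higher_is_better=False)
print(raw_to_t(50, conversion))   # the control median maps to T = 50
print(raw_to_t(95, conversion))   # beyond the control range: clamps to the lowest T
```

Nearest-score lookup clamps any raw score outside the control range to the most extreme tabled T, which is one simple way of handling brain-damaged scores poorer than any control score; interpolation between tabled values would be an equally defensible choice.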

Revision of the impairment index. Con- 
sideration was given to the revision of the 
impairment index, a single score reflective of 
the level of performance on the entire battery 
of tests. Halstead’s Impairment Index repre- 
sents the number of test scores in the battery 
which fall below critical raw score values for 
each test. Use of this method enables classi- 
fication only into either “impaired” or “un- 
impaired” categories, since extremely poor 
or good performance on any of the tests 
cannot be reflected in this type of index. 
It is felt that a more adequate index of over- 
all level of performance could be created by 
examining the mean T score (across tests) 
for each subject. Using this technique it was 
found that use of a mean T score of less than 
46 to classify a subject as impaired resulted 
in a 76% correct classification of the brain- 
damaged group and 78% correct classification 
of the controls. The overall correct classifica- 
tion of 77% appears to be some improvement 
over that obtained with Halstead’s cutting 
points, particularly in the reduction of false 
positive errors, that is, labeling the per- 
formance of control subjects as impaired. At 
this point, one further attempt was made at 
refining the impairment index and the test 
battery itself. Clinical experience, data pub- 
lished by Reitan (1955b), and the present 
results (Table 2) suggested that the Time 
Sense test contributed minimally to the
identification of impaired performance, as
well as being one of the more time-consuming
tests in the battery. Accordingly, the mean
T scores were reanalyzed excluding the
scores on this test. Retaining 46 as the
cutting score, this reanalysis yielded an
improved classification of brain-damaged
and control subjects: 80% correct
identification of the brain-damaged
subjects and 78% of the controls.


TABLE 3

CORRELATIONS OF HALSTEAD TEST SCORES
WITH AGE AND EDUCATION

                                  Control                Brain-damaged
Test                              Age       Education    Age       Education

Category                         -.63**      .45**      -.36*       .22
Tactual performance
  Time                           -.28        .33*       -.32*       .00
  Memory                         -.45**      .26        -.36*       .05
  Location                       -.42**      .11        -.31*       .09
Rhythm                           -.33*       .58**       .03        .22
Speech                           -.40**      .58**      -.25       -.01
Tapping                          -.34*       .33*        .11        .29
Time sense                       -.04        .02         .10       -.17
Revised impairment index         -.57**      .56**      -.33*       .20
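
The revised index (a subject's mean T score across tests, with a mean below 46 labeled impaired) can be sketched as follows. The test names and the example T scores are hypothetical; only the cutoff of 46, the option of dropping the Time Sense test, and the reported percentages come from the text.

```python
def mean_t_score(t_scores, exclude=()):
    """Mean T score across tests, optionally leaving some tests out."""
    kept = [value for name, value in t_scores.items() if name not in exclude]
    return sum(kept) / len(kept)


def is_impaired(t_scores, cutoff=46.0, exclude=()):
    """Revised index: a mean T score below the cutoff is labeled impaired."""
    return mean_t_score(t_scores, exclude) < cutoff


def overall_percent_correct(pct_group_a, n_a, pct_group_b, n_b):
    """Combine two per-group percentages into one overall percentage."""
    return (pct_group_a * n_a + pct_group_b * n_b) / (n_a + n_b)


# HYPOTHETICAL subject whose one high Time Sense score masks an otherwise low profile.
example_subject = {
    "category": 40, "tpt_time": 44, "tpt_memory": 45, "tpt_location": 42,
    "rhythm": 48, "speech": 46, "tapping": 45, "time_sense": 66,
}
print(is_impaired(example_subject))                          # mean T = 47.0
print(is_impaired(example_subject, exclude=("time_sense",)))  # mean T is about 44.3
print(overall_percent_correct(80, 50, 78, 50))               # 79.0
```

With equal group sizes the overall figure is simply the average of the two group percentages: 76% and 78% give 77% for the full battery, and 80% and 78% give 79% once the Time Sense test is dropped.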

Role of age and education. In Table 3 are 
presented the correlation coefficients of the 
test scores with age and education. In gen- 
eral, it can be seen that the correlation co- 
efficients are higher in the control group and 
that test scores are generally more highly 
correlated with age than with stated educa- 
tional level. In the brain-damaged group, no 
test score was significantly correlated with 
education, and no correlation with age reached 
the .01 level. Finally, correlation coefficients 
were calculated for the revised Impairment 
Index (excluding the Time Sense test) with 
age and educational level and are presented 
in Table 3. 


Discussion 


The results of the current analysis have pro-
duced evidence that the Halstead-Reitan tests
provide a quantitative index of adaptive be- 
havior and that the presence of brain damage 
in a group of subjects results in highly sig- 
nificant decrements in performance in com- 
parison with the performance of a group of 
non-brain-damaged subjects. 
Certain shifts were noted in the optimal
cutting points of the scores, and it is felt 
that age differences account for much of the 
difference in the scores of the present sub- 
jects and those available in the literature. 
Comparison of raw scores with the few raw 
scores available in the literature, particularly 
in Halstead (1947) and Shure and Halstead 
(1958), indicates that the scores of both 
brain-damaged and control subjects were gen- 
erally somewhat lower (poorer) in the present 
samples. In view of the significant correlations 
of test performance with age noted in Table 
3, it is not too surprising that the subjects 
reported by these investigators performed 
better. The mean age of Halstead’s (1947) 
non-brain-damaged controls was about 28 
years (as contrasted to present mean age of 
40), while the mean age of the subjects in 
Shure and Halstead’s group of brain-damaged 
subjects was about 35. Similarly, the data for 
the non-brain-damaged control subjects used 
in the report by Chapman, Thetford, Berlin, 
Guthrie, and Wolff (1958) consisted of 
mean scores supplied by Halstead, the mean 
age of these subjects being about 20 years. 

A brief examination of the nature of some 
of the errors in identification underscores some 
of the problems encountered in studies 
attempting to identify presence or absence 
of brain damage in a population consisting 
mostly of hospitalized patients. Of the total 
of 100 patients, 11 controls and 12 brain- 
damaged patients were misclassified. Inspec- 
tion of the misclassified brain-damaged group 
revealed that their mean age was 12 years 
less than that of the misclassified controls 
and that as a group they had a mean Full 
Scale IQ 10 points higher than the misidenti- 
fied controls. The mean IQ of this latter group 
was below average (85), and one may justifi- 
ably assume that the premorbid IQ of the mis- 
classified brain-damaged group was probably 
above average, since, despite the presence of 
brain damage, the group mean IQ was 95. 
Further, at least three of the controls had 
been rated by the neurologist as exhibiting 
peripheral neural involvement, factors pos-
sibly resulting in some performance decre-
ment but purposely not excluded from the
control group.

Preliminary attempts with this group of 
subjects to establish different cutting scores
for the upper versus lower halves of the age 
and education distributions have not resulted 
in improved discrimination. Individual vari- 
ability in the control group and low correla- 
tions between these variables and test scores 
in the brain-damaged group, due no doubt 
to the great variability introduced by the 
damage, probably account for the lack of 
effectiveness of such attempts at age and 
education corrections in this relatively small 
population. Efforts will continue to obtain 
suitable numbers of control subjects through- 
out a wide age range to make age norms 
more feasible. 

Cultural factors may help account for the 
relatively lower level of performance of the 
present control group compared to that 
on which the original cutoff scores were 
developed. It is quite possible that sub- 
jects in the present study did not have 
quite the competitiveness on measures of 
intellectual functions which would be char- 
acteristic of more urban settings of the 
Midwest. Further, the fact that the con- 
trol group consisted primarily of patients hos- 
pitalized on a neurological ward for disorders 
other than brain damage is an important 
consideration when evaluating the “clinical” 
utility of the test battery. It was apparent 
that these control patients were sick and 
“malfunctioning” individuals, resembling to 
a considerable extent the “pseudoneurologic” 
group of Matthews et al. (1966). Finally, 
they were patients in hospitals in which their 
treatment was financed by public funds. It 
appears evident that any physical or psycho- 
logical factor which threatens the integrity 
and effectiveness of the individual impairs 
performance on these tests, which indeed 
seem to measure a person’s ability to cope 
with some rather basic problems in the en- 
vironment. Although it is important to study 
hospitalized control groups, further efforts 
must be made to obtain normative test data 
on healthy, nonhospitalized self-supporting 
individuals. Such data are needed in order to 
answer some of the more basic questions re- 
garding “normal” performance level and the 
various physical (CNS and peripheral) and 
emotional processes which result in impair-
ment of effective or adaptive behavior.

Returning to the problem of identification
of the brain-damaged subjects by means of 
an impairment index, it can be seen that use 
of the unmodified cutting scores results in an 
overall correct prediction rate of 73%, a fairly 
respectable rate. The rate of “correct” pre- 
diction obtained by the modified version of 
the impairment index (79%) is possibly an 
improvement, although not cross-validated, 
and is perhaps approaching the upper limits 
to be expected when answering the question 
of “yes” versus “no” brain damage. In a
recent study using the Halstead tests, Mat- 
thews et al. (1966) noted that even a com- 
posite measure such as the Impairment Index 
was relatively inefficient as a “yes” or “no” 
discriminator. They concluded that an analy- 
sis of subtle patterns of the test scores is 
needed when evaluating the presence of 
brain damage, rather than relying only on 
level of performance. Criticism of an unending 
search for the perfect brain-damage “screen- 
ing” battery is advanced in a recent review 
of the area by Spreen and Benton (1965). 
In a review of a number of studies of the 
predictive efficacy of tests for brain damage, 
they found that the reported cumulative pre- 
dictive value of several tests in a battery, 
when pooled, averages 80% correct, while in 
batteries of tests using specially weighted pre- 
dictive formulas an average of 83% correct 
predictions was reported. This is perhaps the 
maximum we can expect, considering such 
factors as the fallibility of the neurological di- 
agnosis (usually the ultimate criterion in 
studies of this nature which study nonsurgical 
human cases) and the widely diverse effects 
different types of brain pathology are known 
to have. The most profitable course of action 
would seem to be to pose more focal ques- 
tions about, for example, the effects of specific 
types of brain damage or other pathology
upon human performance, and to examine the
processes underlying the observed decrements. 
The establishment of a nationally usable 
battery of performance measures is the first 
step in this plan of action. 

A few words are in order about the manner 
of presenting the test scores. Raw scores are 
reported in this study since it is felt that the 
T-converted scores, presented in the majority 
of studies of these tests, do not offer other
users of the tests meaningful norms for com-
parison. Conversion of raw scores to T scores
is valuable for intertest comparisons within a 
given group. The “normal” score on every
test, based on the control group, is 50, so
the degree to which a patient deviates from
the norm can be compared across tests whose
raw scores would be difficult to compare with
one another (e.g., minutes, errors, number of
figures recalled). Such a transformation also avoids the
rather misleading symmetry of graphs repre- 
senting performance of brain-damaged and 
control subjects when T scores were con- 
structed on the basis of the combined groups, 
as in Matthews et al. (1966, p. 258). Finally, 
the use of an impairment index consisting of 
the mean T score of all of a subject’s tests 
and which can therefore reflect all gradations 
of performance appears more in keeping with 
Halstead’s concept of a measure of “bio- 
logical intelligence” (Halstead, 1947; Shure 
& Halstead, 1958) than does a measure which 
merely reflects impaired versus unimpaired 
performance on a series of several tests. 
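
The T-score conversion discussed above can be illustrated with a minimal sketch. It assumes the conventional definition T = 50 + 10z, with z computed from the control group's mean and standard deviation, and an impairment index taken as the mean of a subject's T scores. The function names, the sign flip for measures on which a lower raw score is better (time, errors), and the sample numbers are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def t_scores(raw_scores, control_scores, higher_is_better=True):
    """Convert raw scores to T scores normed on a control group.

    Assumes the conventional T = 50 + 10 * z, where z uses the control
    group's mean and sample standard deviation. For measures such as time
    or errors, pass higher_is_better=False so that poorer performance
    always yields a lower T score (an assumption of this sketch).
    """
    m = mean(control_scores)
    s = stdev(control_scores)
    sign = 1.0 if higher_is_better else -1.0
    return [50.0 + 10.0 * sign * (x - m) / s for x in raw_scores]


def impairment_index(subject_t_scores):
    """Mean T score across all of one subject's tests."""
    return mean(subject_t_scores)


if __name__ == "__main__":
    # Hypothetical control-group raw scores on a single test.
    controls = [10.0, 12.0, 14.0, 16.0, 18.0]
    # A hypothetical patient scoring below the control mean on that test.
    print(t_scores([11.0], controls))
    # Hypothetical T scores for one patient across three tests.
    print(impairment_index([38.0, 45.0, 52.0]))
```

Because every test is placed on the same scale, a mean of T scores reflects gradations of performance rather than a simple count of tests falling beyond a cutting score.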

DRAW-A-PERSON TEST AS A MEASURE OF INTELLIGENCE
IN PRESCHOOL CHILDREN FROM VERY LOW 
INCOME FAMILIES 


LOIS-ELLIN DATTA ¹
National Institute of Mental Health 


Ethnic group and sex did not affect either the IQ-equivalent scores or the 
congruent validities of the figure-drawing test (Goodenough-Harris scoring) in 
a nationally representative sample of 956 children enrolled in full-year Head 
Start programs. Higher correlations were obtained for performance (Caldwell- 
Soule Preschool Inventory) than for verbal (PPVT) measures and for older 
(4-7 yr.) than for younger children. The obtained coefficients (.3-.5) compare 
favorably with those previously reported for kindergarten and 1st-grade 
children from less impoverished families. Both PPVT and DAP mean IQ 
equivalents indicated, however, substantially lower performance for Head
Start than for normative groups.


Recent interest in special educational pro- 
grams has drawn attention to problems in 
measuring intellectual abilities and changes in 
performance in preschool children from low 
income families. The question of “culture- 
fairness” is thus added to the already consid- 
erable task of obtaining reliable measures at 
an age when the behavioral repertoire is lim- 
ited. 

The figure-drawing test has been widely 
used as a measure of intelligence in children 
(Sundberg, 1960). It is simple to administer 
and score and is considered to have predic- 
tive and congruent validity coefficients that, 
while relatively low, compare favorably with 
those reported for other standardized intelli- 
gence and achievement tests (Shipp & Lou- 
don, 1964; Vane & Kessler, 1964). Dennis 
(1966) concluded that Draw-A-Person per- 
formance reflects experience with representa- 
tional art rather than parental education or 
literacy. 

The availability of data from a nationwide 
sample of children enrolled in Project Head 
Start centers provided an opportunity to esti- 
mate the congruent validity of the figure- 
drawing test for younger children from very 
low income families. This was measured by 
comparing the results of the Draw-A-Person
test (DAP) with the results obtained from
the Peabody Picture Vocabulary Test
(PPVT) and the Caldwell-Soule Preschool
Inventory (PSI).² The PPVT (Dunn, 1965)
is a widely used measure of verbal intelli-
gence; the PSI has been developed as a “cul-
ture-fair” measure of intelligence in preschool
children. The DAP requires less equipment,
administration time, and examiner training
than does the PPVT. The PSI is similar to
the WISC in terms of equipment, examiner
training, administration and scoring time, and
the apparent contribution of verbal and non-
verbal skills to test performance.

¹ This study used data collected by the Planning
Research Corporation for the Office of Economic
Opportunity under Contract No. OEO-1308, 1966.
The author wishes to thank Ann Drake, for her
assistance in data analysis, and Ruth Ann O'Keefe,
for her contributions to every phase of the study.

For the DAP, psychometrically desirable 
characteristics of a culture-fair test would 
include (a) a mean standard score of about 
100 and (b) correlations between the DAP 
and the PPVT and the DAP and the PSI at 
least similar in magnitude to validity coeffi- 
cients typically reported for the DAP (Har- 
ris, 1963). 


METHOD 


Seventy-two Project Head Start centers were se- 
lected to provide a sample representative of the 
population of 1966 full-year program centers in 
terms of geographic distribution and program length. 
From each center, 12-15 children were selected at 
random from an identification number list for in- 
clusion in the survey. 

The DAP, PPVT, and PSI were administered indi-
vidually by college graduates with special training
in examining disadvantaged children. Sixty-five chil-
dren, who were predominantly of Mexican origin,
were tested in Spanish. Since the equivalence of the
Spanish and English versions of the PPVT was not
determined, data from children tested in Spanish
were analyzed separately from data of children
tested in English.

² B. Caldwell and D. Soule, The Preschool Inven-
tory. Unpublished paper, Project Head Start, Office
of Economic Opportunity, Contract S14, 1966.

Draw-A-Person (Machover, 1948) rather than 
Draw-A-Man instructions were used for the figure- 
drawing test. Data on sexual identification will 
be reported in a later paper. Bliss and Berger (1954) 
have concluded that the two forms of the test yield 
substantially the same results. Unless the drawing 
was identified as a woman by the child, ambiguous 
figures were scored by the Goodenough-Harris cri- 
teria for drawings of men (Harris, 1963). Of the 
956 drawings, 239 were not recognizable figures 
(Class A), 111 were scored by Draw-A-Woman 
criteria, and 606 were scored by Draw-A-Man cri-
teria. Interrater reliabilities among the four scorers
ranged .89-.99 for samples of 14-50 drawings. (For 
a detailed report of sampling, selection, and testing 
procedures, see Commins, Cort, Henderson, & 
O'Keefe, 1967.) 


RESULTS AND DISCUSSION
Mean Standard Scores 


The DAP and PPVT raw scores were con- 
verted to standard scores; the mean standard 
score at each age is set at 100 for the norma- 
tive samples for both tests. Table 1 shows 
that regardless of age, sex, or ethnic group, 
the average performance on both the DAP 
(overall mean standard score, 77.22) and the
PPVT (overall mean standard score, 82.02)
was substantially lower than the mean for the 
normative samples. 

The low PPVT standard scores are con- 
sistent with the poor performance on verbal 
tasks frequently reported for children from 
lower class and minority group families 
(Deutsch, 1965). The low DAP standard 
scores were to some extent unexpected. Previ- 
ous studies have indicated that at least four 
relatively disadvantaged groups have achieved 
mean standard scores of about 100 on the 
DAP. Such means were reported for white 
and Negro kindergarten children in New 
York City public schools (Vane & Kessler, 
1964), for white and Negro 5-year-old chil- 
dren in a New York City day care center 
(Anastasi & D’Angelo, 1952), and for a 
representative sample of 300 Negro first- 
grade children from southeastern states (Ken- 
nedy & Lindner, 1964). Bowers and Giles 
(1966) found an increase in DAP scores as 
socioeconomic status increased among 6- to 
12-year-old children in Evanston, Illinois, but 
the mean DAP standard scores for the lowest 
socioeconomic groups, regardless of sex or age, 
were about 100. 

TABLE 1

DAP, PPVT, AND PSI CORRELATIONS BY SEX, ETHNIC GROUP, AND AGE IN A SAMPLE OF 956
PRESCHOOL CHILDREN ENROLLED IN PROJECT HEAD START

                                    M standard score      Raw-score correlation
Group                  N    M ageᵃ    PPVT      DAP     DAP/PPVT    DAP/PSI    PPVT/PSI
Total                 956    60.3     82.02    77.22      .46         .56        .73
Sex, ethnic groupᵇ
  Boys, white         188    63.3     89.26    76.57      .40     [illegible]    .69
  Boys, Negro         273    56.4     80.64    76.02      .52         .56        .72
  Girls, white        166    62.1     85.18    78.14      .51         .60    [illegible]
  Girls, Negro        264    58.2     79.32    77.45      .44         .54        .65
  Spanish-speaking     65    71.8     69.77    80.66      .53         .60        .?2
Ageᵇ
  3                    72    44.1     80.01    76.76      .38         .39        .48
  4                   397    54.9     82.23    78.76      .22         .26        .69
  5                   335    63.8     84.32    74.21      .31         .44        .62
  6                    87    76.0     83.02    79.53      .52         .57        .80

ᵃ In months.
ᵇ Includes only children tested in English; too few children were tested in Spanish to compute data for sex and age subgroups.

The norms for younger children on the
DAP are not geographically and economically
representative of the national population at
these ages; the present sample differs from
the normative samples and from the four 
cited lower income groups in geographic dis-
tribution and in degree of economic depriva-
tion. Eisenberg and Conners (1966) have
reported a DAP mean standard score of about 
81 for 712 children entering Baltimore Head 
Start classes; of these children, 44% came 
from families with an annual income of less 
than $3,000, 30% were supported by welfare, 
64% of the fathers were unskilled, and about 
60% of both parents had less than a tenth- 
grade education. In the present sample, the 
mean annual income was $3,771 for a living 
group which averaged 6.7 persons, approxi- 
mately $560 per year per person. 

In comparison, the median educational 
level completed by the parents of the New 
York City children (Anastasi & D’Angelo, 
1952) was the eleventh grade, all of the 
mothers were employed, the median number 
of siblings was 1, and only 13% of the fath- 
ers were unskilled. The Baltimore children 
thus appear to differ from the New York City 
children primarily in the severity of economic 
deprivation; they appear to differ from the 
present sample primarily in geographic loca- 
tion and urban/rural distribution. Although 
the data do not permit estimations of the 
independent contributions of income, city size, 
geographic area, or of their interactions, fac- 
tors associated with very low income rather 
than factors associated with geographic or 
urban/rural distributions seem to be responsi- 
ble for the differences in DAP performance 
between Head Start children and the previ- 
ously reported samples. 

The low mean DAP standard scores indi- 
cate that, with the present norms, the test 
would not provide a “culture-fair” measure 
of individual attainment in an economically 
heterogeneous group. Within the Head Start 
sample, however, DAP standard scores were 
not affected by ethnic or sex differences, 
while the PPVT standard scores were affected 
by factors associated with both sex and eth- 
nic group. Results of a 2 x 2 unweighted 
means analysis of variance (Winer, 1962) 
indicated that the DAP standard scores of 
girls and boys and of Negro and white chil- 
dren did not differ significantly. Ethnic group 
and sex F ratios significant at p ≤ .01 were
found for PPVT standard scores: The PPVT
performance of white children was higher than 
the performance of Negro children (F = 
31.43, p = .001), and boys, regardless of eth- 
nic group, achieved higher PPVT scores than 
did girls (F = 4.37, p ≤ .01). The PPVT
mean standard score for Spanish-speaking 
children (69.77) was significantly lower than 
the PPVT means for other groups; the DAP 
mean standard score for Spanish-speaking 
children (80.66) did not differ significantly 
from the DAP mean scores for other groups. 
The DAP may thus be relatively insensitive 
to factors affecting the PPVT scores, and 
among these factors may be those related to 
cultural influences.

It is difficult to estimate the extent to 
which the low mean standard scores on both 
the DAP and the PPVT are due to cognitive 
as contrasted to emotional or motivational 
associates of deprivation. Some evidence of 
the importance of cognitive factors may be 
found in the report that culturally deprived 
children were not reliably lower on all mea- 
sured aspects of psycholinguistic functioning, 
but were primarily handicapped in the areas 
of auditory word comprehension and auditory 
vocal automatic decoding (Barrett, Semmel, 
& Weener, 1965). On the other hand, rela- 
tively minor changes in testing conditions 
have been associated with substantial im- 
provement in performance (Riessman, 1962). 
Despite agreement on the importance of op- 
timum testing conditions, there have been 
few systematic studies comparing directive 
(“Think again; you can do better than 
that”), standard-neutral, and supportive atti- 
tudes for deprived and privileged preschool 
and elementary children. 


Congruent Validity 


The raw-score product-moment correlations 
among the DAP, PPVT, and PSI shown in 
Table 1 are all significant at p ≤ .01; t com-
parisons among the correlations (r to z trans-
formations) indicated that age, sex, and eth-
nic group did not significantly affect the
congruent validity of the DAP. The corre- 
lations for all subgroups compare favorably 
with the .4 typically reported for groups of 
about 100 normal kindergarten and first-
grade children and with the .39 DAP/PPVT
raw-score correlation obtained in a sample of 
5- to 6-year-old Head Start children (Eisen- 
berg & Conners, 1966). The DAP/PPVT 
standard-score product-moment correlations 
ranged .19-.52; these correlations, while 
lower than the raw-score correlations, were 
significant at p ≤ .01. The single exception was
.22 obtained for 3-year-old children. 
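
The comparison of correlations through r to z transformations mentioned above can be sketched as follows. This is a minimal illustration that assumes the standard Fisher transformation for two independent samples and a normal-curve test of the difference; it is not necessarily the exact procedure used in the study, and the function name is hypothetical. The example values are the DAP/PPVT raw-score correlations listed in Table 1 for Negro boys (.52, N = 273) and white boys (.40, N = 188).

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Test the difference between two correlations from independent samples.

    Each r is converted with Fisher's transformation z = atanh(r); the
    difference is divided by its standard error, sqrt(1/(n1-3) + 1/(n2-3)).
    Returns the test statistic and a two-tailed normal-curve p value.
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    statistic = (z1 - z2) / standard_error
    p_two_tailed = math.erfc(abs(statistic) / math.sqrt(2.0))
    return statistic, p_two_tailed


if __name__ == "__main__":
    statistic, p = compare_independent_correlations(0.52, 273, 0.40, 188)
    print(f"statistic = {statistic:.2f}, two-tailed p = {p:.3f}")
```

Under these assumptions the difference between the two subgroup correlations does not reach the .05 level, which is in line with the statement that ethnic group did not significantly affect the congruent validity of the DAP.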

Test bias has been defined (Educational 
Testing Service, 1966) as the consistent over- 
or underprediction of a criterion in one sub- 
group as compared to another subgroup, so 
that equally high predictive validity within 
subgroups would indicate a lack of bias. If 
this definition is extended to congruent valid- 
ity, the value of the DAP as an estimate of 
general intelligence appears to be as high 
among children between 4 years, 0 months 
and 6 years, 11 months from very low income 
families as it is among children of this age 
or slightly older from less economically de- 
prived backgrounds. 

Age and validity. The validity of the DAP 
for school children has previously been re- 
ported to decrease with age, being higher for 
children in kindergarten and the first grade 
than for children older than 9 years (Ellis, 
1953; Kennedy & Lindner, 1964; Pringle & 
Pickup, 1963; Vane & Kessler, 1964). As 
sample size increases, correlation magnitudes 
tend to decrease. If sample size is considered 
in this preschool sample, the DAP/PPVT 
raw-score correlations tend to increase with 
age. The lower congruent validity of the DAP 
for the younger children suggests that the 
value of the DAP as a measure of intelligence 
in children may be curvilinear with respect to 
age, increasing from 3-5 years and decreasing 
after about 8 years of age. 

Performance and verbal measures. The con- 
gruent validity of the DAP was higher for 
performance (.56) than for verbal (.45) 
abilities, regardless of age, sex, or ethnic 
group. Similar results have been reported by 
Pringle and Pickup (1963) and Harris 
(1959). These correlations were, however, 
considerably lower than the PPVT/PSI cor- 
relation of .73; the PSI would appear to have 
more reliable variance associated with a 
verbal than with a performance measure of
intelligence.

SUMMARY AND CONCLUSIONS


The DAP, the PPVT, and the PSI were 
administered to a nationally representative 
sample of 956 children attending 1966 Proj- 
ect Head Start full-year classes. Among chil- 
dren from 4 years, 0 months to 6 years, 11
months, the DAP/PPVT and DAP/PSI cor- 
relations compared favorably with validity 
coefficients previously reported for children 
from less deprived homes. Among younger 
children, the congruent validity of the DAP 
was lower. Neither sex nor ethnic group sig- 
nificantly affected DAP correlations; the 
DAP thus meets one criterion of a culture- 
fair measure. 

On both the PPVT and the DAP, however, 
the mean standard scores were substantially 
lower than those reported for the norm 
groups. By the second criterion, the value of 
the DAP as a culture-fair measure of intelli- 
gence remains in question for children in sam- 
ples heterogeneous for socioeconomic status, 
although within this very low income sample, 
the DAP was less affected than the PPVT 
by factors associated with ethnic group and 
sex.

METHODOLOGICAL NOTE ON N ACHIEVEMENT AND
FIELD INDEPENDENCE COMPARISONS 


CARL L. THORNTON AND GERALD V. BARRETT


Goodyear Aerospace Corporation, Akron, Ohio 


The Embedded Figures Test (EFT) has been used as an index of field inde- 
pendence. However, Witkin and his associates originally found that EFT scores 
for women had a low correlation (.21) with their original perceptual tests. 
Since this is the case, all studies which have used female EFT scores as an 
index of field independence have obtained results which are not pertinent to 
Witkin’s concept. In particular, the comparison of n Ach and field independence 
is not possible when female EFT scores are used. 


McClelland and Witkin describe the experien- 
tial precursors of n Ach and field independence 
in very similar terms. McClelland, Atkinson, 
Clark, and Lowell (1953) stated that “mothers 
of sons with low n Ach tend to demand less in 
the way of independent achievement at an early 
age, and tend to be more restrictive than other 
mothers [p. 202].” Witkin, Dyk, Faterson,
Goodenough, and Karp (1962) present the same 
kind of descriptions as important in producing 
field dependence. They tested and confirmed the 
hypothesis that “coercive” child-rearing prac- 
tices which stress authority and conformity 
would yield field-dependent children. 

The fact that these two groups of investi- 
gators have emphasized the same quality, “re- 
strictiveness” or lack thereof, has led two sets 
of investigators (Honigfeld & Spigel, 1960; 
Wertheim & Mednick, 1958) to compare n Ach 
and field independence. Wertheim and Mednick 
(1958) found a .40 correlation between n Ach, as 
measured by McClelland’s standard techniques, 
and field independence, as measured by the Em- 
bedded Figures Test (EFT). Honigfeld and 
Spigel (1960) noted that Wertheim’s sample 
contained a 3:1 ratio of women to men. To ac- 
count for possible sex differences, Honigfeld and 
Spigel tested equal numbers of men and women 
and made separate statistical comparisons. They 
found the same relationship between the EFT 
and n Ach (.40) for women; for men, however, 
a nonsignificant —.12 correlation was found. 
Honigfeld and Spigel (1960) stated, “the repli- 
cation confirmed the finding made by Wertheim 
and Mednick (1958) of a significant, positive 
relationship between n Ach and field independ- 
ence. However, there is evidence in the present 
study to suggest that this hypothesis is pertinent 
only to a female population [p. 551].”


While these results may be important for psy- 
chological theory, the subsequent information 
presented in the present article shows that this 
interpretation is inaccurate.

Witkin, Lewis, Hertzman, Machover, Meiss- 
ner, and Wapner (1954) originally used the fol- 
lowing three tests to obtain an index of field 
independence: the Rod and Frame Test (RFT), 
the Rotating Room Test, and the Tilting Room 
Test. The intercorrelations between the three 
tests ranged .18-.64. Clearly, the three tests are 
highly related, but still do not share a large per-
centage of common variance (4%-41%). Dur-
ing the course of Witkin’s investigations, several
other perceptual-motor tasks were compared to 
the original three tests. One of these tests (EFT) 
was found to correlate significantly with all three 
of the aforementioned tests. Since the RFT had 
the highest reliability and since the RFT and 
EFT correlated most highly together, these two 
tests have been used subsequently by most in- 
vestigators to determine “field independence.” 
The EFT has the widest popularity, since it is 
an easily administered inexpensive paper-and- 
pencil test. 

With this historical background in mind, it is 
important to reinspect the original correlations 
between the EFT and RFT obtained by Witkin 
et al. (1954). Table 1 shows these comparisons. 

Using the average results represented by the 
orientation index, it was noted that the correla- 
tion accounts for approximately 4% of the com- 
mon variance for women as compared to 41% 
for men. If the RFT is operationally de- 
fined as a measure of field independence, then 
it is clear that for women the EFT is a com-
pletely unacceptable substitute test. Even for
males it is quite possible that the two percep-
tual tests are measuring two separate factors,
since only 40% of the common variance is ac-
counted for.

TABLE 1

EFT CORRELATED WITH RFT

          RFT Series 1    RFT Series 2    RFT Series 3    Orientation index
Test        M      F        M      F        M      F         M      F
EFT        .47    .03      .43    .22      .76    .26       .64    .21
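
The common-variance figures quoted above follow from squaring the correlation coefficient. The short sketch below shows that arithmetic for the orientation-index correlations between the EFT and RFT in Table 1 (.64 for men, .21 for women); the function name is hypothetical and used for illustration only.

```python
def common_variance_percent(r):
    """Percentage of variance shared by two measures correlated at r (r squared)."""
    return 100.0 * r * r


if __name__ == "__main__":
    # EFT with RFT orientation index, from Table 1 (Witkin et al., 1954).
    for sex, r in (("men", 0.64), ("women", 0.21)):
        print(f"{sex}: r = {r:.2f}, common variance = {common_variance_percent(r):.0f}%")
```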

These data obtained from Witkin’s studies 
should now be compared to the two studies re- 
lating n Ach to field independence. For males, 
the EFT is an approximate index of field inde- 
pendence, but is not found to be related to n 
Ach. For females, however, the EFT is not a
measure of field independence, but is correlated 
with n Ach. 

It is evident from the analysis presented in 
this paper that all studies which have used fe-
male EFT scores as an index of field independ-
ence have obtained results which are not perti-
nent to Witkin’s concept. (About 30 studies have
been reported in the literature which relate fe-
male EFT scores to other measures. All these
studies need to be reanalyzed in the light of the 
present methodological interpretation.) Actually, 
EFT scores for females may be more closely 
aligned to the McClelland et al. (1953) n Ach 
concept.

DIFFERENTIAL EFFECTS OF THERAPIST RACE AND SOCIAL
CLASS UPON PATIENT DEPTH OF SELF-EXPLORATION IN
THE INITIAL CLINICAL INTERVIEW ¹


ROBERT R. CARKHUFF and RICHARD PIERCE 
State University of New York at Buffalo 


A Latin-square design incorporating whites and Negroes and upper and lower 
social classes was replicated across 4 groups of 4 hospitalized mental patients 
by 4 trained lay counselors. Randomly selected excerpts from the 64 recorded 
clinical interviews were rated on the depth of patient self-exploration. Race and 
social class of both patient and therapist were significant sources of effect, and 
the interaction between patient and therapist variables was significant. As 
patient depth of self-exploration during early clinical interviews has been highly 
correlated with outcome indexes of constructive patient change, the results have 
implications for counseling and psychotherapy. 


The effect of the social class of the counselor
or therapist upon the therapeutic movement and
outcome of the patient in therapy has been rec-
ognized in the psychological literature (Bernard,
1963; Hollingshead & Redlich, 1958; Jessor,
1956; Moore, Benedek, & Wallace, 1963). It has
been suggested that lower class patients are not
amenable to treatment by therapists who, by the
usual criteria of education and vocation, are
members of higher socioeconomic classes. The
assumption is that the social class differences
present a communication barrier which does not
allow facilitative interpersonal processes to take
place.

¹ This investigation was supported by University
of Massachusetts Faculty Research Grant FR-J21
to Robert R. Carkhuff. The authors wish to ac-
knowledge the helpful assistance of Robert De-
Burger, Eastern State Hospital, Lexington, Ken-
tucky, in arranging for patient interviews.

Similarly, racial effects upon outcome have 
been hypothesized in the areas of therapy 
(Shane, 1960), test examination (Canady, 1942; 
Shuey, 1958), and education (Silberman, 1964). 
Typically, the inhibiting effects of white coun- 
selors upon the responses of Negroes are noted. 

The effects upon patient process movement of 
each of these therapist characteristics may be 
accounted for by the other characteristic. Thus, 
the effects of social class may account for what 
have been presumed to be racial effects, and 
racial effects may, in some instances, have con- 
tributed to studies establishing social class as a 
significant source of effect. The example of the 
white upper class therapist offering treatment to 
the lower class Negro is a situation frequently 
replicated in our outpatient and inpatient serv- 
ices. 

The present study was designed to ferret out 
the differential effects of (a) the race and (b) 
the social class of the therapist upon patient 
depth of self-exploration, a critical index of 
patient therapeutic process involvement and a 
significant correlate of positive therapeutic out- 
come (Truax & Carkhuff, 1964). 


METHOD 


The four lay counselors, who had completed a men- 
tal health training program, were functioning at 
high levels of empathy, positive regard, and genu- 
ineness (Carkhuff & Truax, 1965a). In group ther- 
apy patients, these counselors had elicited a thera- 
peutic change significantly greater than that dem- 
onstrated by a control group of patients (Carkhuff 
& Truax, 1965b). The counselors were: (a) an upper 
class white, (b) an upper class Negro, (c) a lower 
class white, and (d) a lower class Negro. Thus, all 
lay counselors had (a) similar training, (b) similar 
kinds of therapeutic experience, and (c) had demon- 
strated no significant differences in their levels of 
counselor-offered conditions as measured by rating 
scales of empathy, positive regard, genuineness, and 
the depth of self-exploration elicited in patients in 
clinical interviews. A two-factor index of social class 
involving the educational and vocational (profes- 
sional or nonprofessional) level of the counselors and 
their spouses was employed for the counselors, while 
only educational level was considered for the in-
patients. Study beyond the high school level was
operationally defined as upper class for both coun- 
selors and patients. The upper class white counselor 
was 35 years of age and had completed 3 years of 
graduate study in education. While she was a vol- 
unteer worker at the time of this study, she had 
previously been employed as a teacher. The upper 
class Negro counselor was 38 years of age, had com- 
pleted 2 years of undergraduate college, and was 
employed as a physical therapist. The lower class
white counselor was 45 years of age, had completed
high school, and was employed as a hospital at- 
tendant. The lower class Negro counselor was 50 
years old, had completed high school, and was em- 
ployed as a hospital attendant. All were female. The 
husbands of the upper class counselors were em- 
ployed in professional capacities, while the husbands 
of the lower class counselors were employed in 
semiskilled occupations. 

A Latin-square design was replicated across four 
different groups of four hospitalized mental pa- 
tients: (a) four upper class white patients, (b) 
four upper class Negro patients, (c) four lower class 
white patients, and (d) four lower class Negro pa- 
tients. All 16 patients were female with diagnoses of 
schizophrenia. The upper class white patients ave- 
raged 14.5 years of schooling; the upper class Ne- 
groes, 14 years; lower class whites, 9.75 years; and 
lower class Negroes, 9.5 years. Overall, the patients 
averaged 42 years of age and 4 years of institution- 
alization, with no significant differences between any 
of the four groups. All therapists and all patients 
were Southerners. 

Each counselor saw each patient in a design 
counterbalanced to control for the effects of order. 
In order to control for counselor fatigue factors, 
each group of four patients was seen 1 week apart. 
The patients rotated to the counselors’ rooms, and 
each 1-hour interview was recorded. All counselors 
began each session encouraging the patient in an 
open-ended fashion to discuss “whatever is impor- 
tant” to the patient “at this moment in time.” 
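
The counterbalancing described above can be illustrated with a small sketch. The report does not give the actual square that was used, so the cyclic construction below is only an assumption chosen for simplicity; the function names are hypothetical. Each row is one patient in a group of four, each column is an ordinal position, and each cell names the counselor seen at that position.

```python
def cyclic_latin_square(n):
    """Return an n x n Latin square built by cyclic shifts.

    Row p lists the counselor indices (0..n-1) in the order patient p
    meets them. This is one possible square, not necessarily the one
    used in the study.
    """
    return [[(row + position) % n for position in range(n)] for row in range(n)]


def is_latin_square(square):
    """Check that every symbol occurs once per row and once per column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


# Counselor labels as given in the text.
COUNSELORS = [
    "upper class white",
    "upper class Negro",
    "lower class white",
    "lower class Negro",
]

square = cyclic_latin_square(len(COUNSELORS))
for patient, order in enumerate(square, start=1):
    print(f"Patient {patient}:", " -> ".join(COUNSELORS[i] for i in order))
```

Repeating such a square across the four patient groups gives 4 x 4 x 4 = 64 interviews, which matches the number of recordings reported.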

Six 4-minute excerpts were randomly selected from 
each of the 64 recorded clinical interviews and 
rated on the scale “depth of self-exploration in in- 
terpersonal processes” (Carkhuff, 1965) by two ex- 
perienced raters trained in rating patient self-ex- 
ploration. Pearson correlations yielded intrarater 
reliabilities over 1 week of .80 and .88 and an inter- 
rater reliability of .78. The five-point scale ranged 
from the lowest level, Level 1, where “the second 
person does not discuss personally relevant ma- 
terial, either because he has had no opportunity to 
do so, or because he is actively evading the discus- 
sion even when it is introduced by the first person,” 
to Level 5, where “the second person actively and 
spontaneously engages in an inward probing to 
newly discover feelings or experiences about him- 
self and his world.” 


RESULTS 


Race and social class of both patient and ther- 
apist were significant sources of effect, and the 
interaction between the patient and therapist 
variables was significant (see Table 1). Thus, 
both racial and class variables have an effect 
upon patient depth of self-exploration, and the 
effect of the patient variables is contingent upon 
the therapist variables. 

In general, the ratings of patient self-explora- 
tion ranged from Level 1 to Level 3, with an 
average of slightly under Level 2. Also, in gen-
eral, the patients most similar to the race and
social class of the counselor involved tended to
explore themselves most, while patients most
dissimilar tended to explore themselves least.


TABLE 1

ANALYSIS OF VARIANCE OF EFFECTS OF RACE AND
SOCIAL CLASS UPON PATIENT DEPTH OF
SELF-EXPLORATION

Source of variance          df      F
Between Ss                  15
  Patient                    3
    Color (C)                1     8.07*
    Social class (S)         1     8.34*
    C x S                    1     1.68
  Ss/Patient                12
Within Ss                   48
  Therapist                  3
    Color (C)                1     7.20*
    Social class (S)         1     8.00**
    C x S                    2     2.57
  Patient x Therapist        9    94.00**
  OPS (temporal)             3     2.05
  Residual                  33
Total                       63

Note.—Abbreviated: OPS = ordinal position within se-
quence.
*p < .05.
**p < .01.

There were no significant effects of the order 
in which the patients were seen, and the effects 
of race were not dependent upon the level of 
social class in both patient and therapist. 


DISCUSSION


Both race and social class of both patient and 
counselor appear to be significant sources of 
effect upon the depth of self-exploration of pa- 
tients in initial clinical interviews. As patient 
depth of self-exploration during early clinical in- 
terviews has been highly correlated with outcome 
indexes of constructive patient change (Truax & 
Carkhuff, 1964), the results have implications 
for counseling and psychotherapy. For example,
increasingly larger numbers of lower class Ne- 
groes are represented in inpatient and outpatient 
treatment populations, but are very poorly rep- 
resented on the treatment staffs offering the serv- 
ices. If, indeed, the white upper class counselors 
and therapists who are readily available have an 
inhibiting effect upon the self-exploration of their 
lower class Negro patients, then the whole proc- 
ess of rehabilitation and improvement for lower 
class Negroes is limited. 


NOTES AND COMMENTS


The average level of self-exploration by the 
patients, Level 2, is described as follows: “[the 
patient] responds with discussion to the intro- 
duction of personally relevant material but does 
so in a mechanical manner and without the dem- 
onstration of emotional feeling.” The direct sug- 
gestion, then, is that the patients in the particular 
setting involved and under the conditions in- 
volved did not explore themselves at very deep 
levels during the initial interview. 

A final and important consideration is the fact 
that the patients and lay counselors were South- 
erners. The availability of trained lower class lay 
counselors, which made the study possible, dic- 
tated that the study be accomplished in their 
home community. The question of whether or 
not these differences would hold in other areas 
of the country has not been answered. 


VALIDITY OF WISC SHORT FORMS AT THREE AGE LEVELS


A. B. SILVERSTEIN 
Pacific State Hospital, Pomona, California 


Correlations with the Full Scale of all possible short forms of 2, 3, 4, and 5 
subtests were determined from the WISC standardization data at 3 age
levels: 7½, 10½, and 13½ yr. The 10 best short forms of each length at each age
level were considered, and information was given on their standard errors of 
estimate. A method which entails differential weighting of subtest scores 
rather than their simple summation did not result in appreciably higher 
validities. Data were also provided on the extent of agreement between the 
best short forms and the Full Scale in classifying individuals. 


Since the publication of the WISC (Wechsler, 
1949), a number of studies dealing with the 
validity of WISC short forms have appeared. 
Most of these have employed rather atypical 
samples, for example, mentally retarded or emo- 
tionally disturbed children, which tend to be 
either more homogeneous or more heterogeneous 
than the general population. Geuting (1959), 
following the lead of McNemar (1950) with the 
Wechsler-Bellevue and Maxwell (1957) with the 
WAIS, has investigated the validity of certain 
WISC short forms using the standardization data 
for the test, but her findings are not generally 
available. 

Both McNemar (1950) and Doppelt (1956) 
have provided formulas for the validity of a 
short form which do not require access to the 
raw data. Either one can be applied, for example, 
to the tables of WISC intercorrelations which 
Wechsler has presented for 200 children at each 
of three age levels: 7½, 10½, and 13½ years. In
the present study, Doppelt’s (1956) formula was
used to determine the correlations with the Full
Scale of all possible short forms of two, three,
four, and five subtests.1 Since the standard devi-
ation of each subtest is 3, the formula for a
short form of k subtests simplifies to
$\sum r_{jt} / \sqrt{k + 2\sum r_{ij}}$,
where i and j identify the subtests,
and t the Full Scale.
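
As a rough illustration of the simplified formula, the sketch below computes the correlation between an unweighted sum of k subtests and the Full Scale from a table of intercorrelations. The function name and the correlation values are hypothetical and are not taken from Wechsler's tables; the sum over r_ij is taken once over each distinct pair of subtests.

```python
from itertools import combinations
from math import sqrt


def short_form_validity(subtests, r_sub, r_full):
    """Correlation of an unweighted short form with the Full Scale.

    subtests: names of the k subtests in the short form
    r_sub:    dict mapping frozenset({a, b}) to the correlation r_ab
    r_full:   dict mapping a subtest name to its correlation with the Full Scale
    Assumes all subtests share the same standard deviation.
    """
    k = len(subtests)
    numerator = sum(r_full[j] for j in subtests)
    pair_sum = sum(r_sub[frozenset(pair)] for pair in combinations(subtests, 2))
    return numerator / sqrt(k + 2 * pair_sum)


# Hypothetical values, for illustration only.
r_full = {"I": 0.80, "V": 0.80, "BD": 0.70}
r_sub = {
    frozenset({"I", "V"}): 0.60,
    frozenset({"I", "BD"}): 0.40,
    frozenset({"V", "BD"}): 0.45,
}

for k in (2, 3):
    for form in combinations(sorted(r_full), k):
        print("/".join(form), round(short_form_validity(form, r_sub, r_full), 3))
```

Looping over every combination of a given length and sorting by the result would yield a list of best short forms of that length.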


RESULTS AND DISCUSSION 


Table 1 shows the range of correlations with
the Full Scale of the 10 best short forms of each
length at each age level.2 For obvious reasons
the correlations increase as the length of the
short form increases. There is a tendency for the
correlations at age 10½ to be higher than those
at 13½, which in turn tend to be higher than
those at 7½. These differences parallel Cohen’s
(1959) findings concerning age differences in the
proportion of the total WISC variance attribu-
table to the general factor G. The present cor-
relations are systematically lower than those
given by Maxwell (1957) for the WAIS, which
is consistent with Cohen’s observation that G
accounts for a smaller proportion of the total
variance of the WISC than of the WAIS.

1 Computer support from the Socio-Behavioral

Information/Picture Arrangement, Informa-
tion/Block Design, and Vocabulary/Block De-
sign are among the best dyads at all three age
levels. Information/Vocabulary/Object Assembly
is similarly distinguished among the triads, and
Information/Comprehension/Arithmetic/Picture
Arrangement/Object Assembly among the pen-
tads; no tetrad appears on all three lists. In
evaluating these results, it should be noted that
the arbitrary practice of considering only the 10
best short forms of each length results in the
selection of 22.2% of all possible dyads as com-
pared to only 8.3%, 4.8%, and 4.0% of all pos-
sible triads, tetrads, and pentads, respectively.
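
The percentages quoted above can be checked with a few lines of arithmetic. The sketch assumes that 10 subtests entered the analysis; that count is not stated in the text and is inferred here from the fact that 10 out of 45 possible pairs is 22.2%.

```python
from math import comb

N_SUBTESTS = 10  # assumption inferred from 10/45 = 22.2%, not stated in the text
N_BEST = 10      # the 10 best short forms of each length


def share_of_best(k, n_best=N_BEST, n_subtests=N_SUBTESTS):
    """Percentage of all k-subtest short forms represented by the n_best forms."""
    return 100 * n_best / comb(n_subtests, k)


for k, label in [(2, "dyads"), (3, "triads"), (4, "tetrads"), (5, "pentads")]:
    print(f"{label}: {comb(N_SUBTESTS, k)} possible, {share_of_best(k):.1f}% selected")
# dyads: 45 possible, 22.2% selected
# triads: 120 possible, 8.3% selected
# tetrads: 210 possible, 4.8% selected
# pentads: 252 possible, 4.0% selected
```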
As McNemar (1950) pointed out, the useful-
ness of a short form depends on the accuracy
with which it provides estimates of IQs on the
Full Scale. At ages 10½ and 13½, the standard
errors of estimate are approximately 7.0, 5.5,
4.5, and 4.0 IQ points for the best dyads, triads,
tetrads, and pentads, respectively. At age 7½, the
standard errors are about 1.5 points higher for
the best dyads and triads and about 1.0 point
higher for the best tetrads and pentads. Of
course, 1 in 20 estimates will involve errors
twice as large as these.


TABLE 1

CORRELATIONS WITH FULL SCALE OF
WISC SHORT FORMS

Item       Age 7½       Age 10½      Age 13½
Dyads      .807-.837    .872-.906    .858-.890
Triads     .881-.901    .925-.942    .917-.935
Tetrads    .919-.930    .948-.958    .944-.952
Pentads    .942-.947    .962-.968    .960-.966


TABLE 2

CORRELATIONS WITH FULL SCALE OF WISC SHORT FORMS SELECTED BY THE
WHERRY-DOOLITTLE METHOD

Age 7½                     Age 10½                      Age 13½
Form          R     r      Form            R     r      Form           R     r
V OA         .839  .837    V BD           .919  .906    I OA          .897  .884
I V OA       .892  .891    A V BD         .946  .938    I C OA        .936  .934
I C V OA     .924  .919    A V PC BD      .958  .947    I C PA OA     .958  .952
I C A V OA   .945  .939    A V PC BD CO   .970  .960    I C A PA OA   .969  .964

Note.—Abbreviated: V = Vocabulary; OA = Object Assembly; I = Information; C = Comprehension; A = Arithmetic;
BD = Block Design; PC = Picture Completion; CO = Coding; PA = Picture Arrangement.

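
The standard errors of estimate discussed in this section appear consistent with the usual relation SE = SD x sqrt(1 - r^2). The sketch below applies that relation to a few of the upper-bound correlations in Table 1. It assumes an IQ standard deviation of 15 points; the text does not state the value or the exact computation used, so this is a plausibility check rather than the author's procedure.

```python
from math import sqrt

IQ_SD = 15.0  # assumption: Full Scale IQ standard deviation of 15 points


def standard_error_of_estimate(r, sd=IQ_SD):
    """Standard error in predicting Full Scale IQ from a short form with validity r."""
    return sd * sqrt(1.0 - r * r)


# Highest correlation in each Table 1 range at age 10½ (values from the text).
best_at_ten_and_a_half = {
    "dyads": 0.906,
    "triads": 0.942,
    "tetrads": 0.958,
    "pentads": 0.968,
}

for label, r in best_at_ten_and_a_half.items():
    se = standard_error_of_estimate(r)
    # About 1 estimate in 20 is expected to miss by more than twice the standard error.
    print(f"{label}: r = {r:.3f}, SE = {se:.1f} IQ points, 2 x SE = {2 * se:.1f}")
```

Under this assumption the results fall close to the rounded figures of 7.0, 5.5, 4.5, and 4.0 points, with the lower end of each range giving somewhat larger errors.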

Several previous studies in this area have used 
the Wherry-Doolittle method (Dubois, 1957),
which tends to select the very best short form of 
each length, but which entails the differential 
weighting of subtest scores rather than their 
simple summation. In the present study, this 
method was applied to the WISC standardiza- 
tion data. Table 2 shows the multiple correla- 
tion with the Full Scale of the best short form 
of each length at each age level. The correspond- 
ing simple correlation is also given for compari- 
son. The use of differential weighting does not 
result in appreciably higher validities in the 
present case. 

Finally, Mumpower (1964) has argued that 
the customary correlational measure of validity 
is not as meaningful as the agreement between 
short form and Full Scale in classifying indi-
viduals. With the aid of tables issued by the
United States Department of Commerce (1959), 
the agreement between the best short forms and
the Full Scale was estimated, using Wechsler’s
(1949) seven-category classification system, from
Very Superior to Defective. Table 3 summarizes
the results. The best dyads misclassify more than
one individual in three, and even for the best
pentads the corresponding figure is one in five.
Data on agreement may be used to supplement
correlational data in evaluating the validity of
short forms.


TABLE 3

AGREEMENT WITH FULL SCALE OF
WISC SHORT FORMS

Item       Age 7½    Age 10½    Age 13½
Dyads      56.6%     65.5%      62.4%
Triads     63.7%     71.1%      11.1%
Tetrads    69.3%     75.6%      73.0%
Pentads    73.0%     78.6%      78.6%


EFFECTS OF AGE AND INTELLIGENCE ON THE OPERATION
OF SUPPRESSION


IJA N. KORNER AND MARILYN M. BUCKWALTER
University of Utah 


This study was designed to investigate the relative contributions of age and
intelligence to the ability to eliminate (“suppress”) a forbidden stimulus.
Grade-school Ss at 3 age levels (M = 8, 9½, and 11½) were given a TAT-like
picture with a limited number of prominent stimulus objects and asked to
tell stories which excluded 1 of the stimuli. Responses were classified according
to process used in suppressing the forbidden stimulus. Intelligence, measured
by scores on figure drawings, was uncorrelated with ability to suppress. De-
velopmental differences, however, were apparent. Several of the youngest Ss
were unable to comply with the task at all, and the younger Ss generally used
more primitive suppressive methods than the older children. In addition, the
number and refinement of suppressive techniques used by all grade-school Ss
were more limited than those used by older Ss previously studied.


Korner (1966) recently reiterated the diffi- 
culty of studying the so-called defense mecha- 
nisms in terms of speculative dynamics and sug- 
gested, instead, the value of investigating the 
processes or operations (mechanics) involved in 
their use. Focusing upon the mechanism of sup- 
pression, Korner gave subjects a TAT-like pic- 
ture containing a few prominent stimulus objects 
and asked them to tell stories which excluded one 
of the stimuli. Responses were classified accord- 
ing to processes used in eliminating (“suppress- 
ing”) the forbidden stimulus. 

A somewhat surprising finding was that a 
group of average high-school subjects and a 
group of highly intelligent graduate students 
used essentially the same limited number of 
rather primitive suppressive methods. A few of
the older and more intelligent subjects, however, 
introduced some seemingly more refined opera- 
tions to accomplish successful suppression, and 
the older group as a whole appeared to have less 
difficulty in attempting the task at all. These 
findings suggested that most persons are in pos- 
session of the major methods of suppression by 
approximately age 16, but the relative contribu- 
tion of age and IQ to successful suppression was 
not clarified. The present study was designed to 
investigate both developmental and intellectual 
differences in the ability to suppress. 


PROCEDURE 


The experimental task was that previously used 
by Korner (1966): Each S was shown a large (8- 
by 12-inch) picture of a boy (with his back to the 
viewer) sitting on a rug and reading a book; in 
front of him are a banana and a basketball. The 
Ss were asked to “make up a story about this pic-
ture, but do not use a boy in your story.” Great
care was taken to insure that Ss comprehended the
instructions.

Pilot studies indicated that almost no S younger 
than 7 years (Grade 2 in school) was able to com- 
ply with the task, and so testing was begun at this 
age. Three experimental groups were used, each con-
taining an N of 20: (a) second graders with a mean
age of 8 years and a mean IQ of 103 (range, 88-
170); (b) fourth graders with a mean age of 9½
and a mean IQ of 102 (range, 78-138); and (c)
sixth graders with a mean age of 11½ and a mean
IQ of 104 (range, 57-113). The IQ scores for all Ss
were obtained from individual figure drawings 
scored by the Goodenough (1926) method. 

All stories told by Ss were analyzed by categor- 
izing the central figure(s) around whom the plots 
or themes developed. 


RESULTS 


As in the previous study, subjects’ comments 
indicated that the experimental task is more 
difficult than might be supposed. However, only 
5 of the 60 subjects were completely unable to 
attempt the task; these 5 were all second graders 
and the 5 youngest in their class. 

The remaining 55 subjects, regardless of age, 
all used one of two methods which were familiar 
from the earlier study. The only major quanti- 
tative difference between the present subjects 
and the older ones previously studied, in fact, 
was that the grade-school children used fewer of 
the possible operations. 

The first approach relied on by the grade- 
school subjects was the relatively primitive de- 
vice of simply relabeling the primary stimulus 
figure by changing his sex, species, or number 
and then giving a story very little different from 
one which might have been told about a boy. 
Eight of the 20 second graders (40%), 13 of the 
fourth graders (65%), and 8 of the sixth graders
(40%) used this approach. The total of 49% 
for the combined groups is probably equivalent 
to the 54% of the high school subjects who 
earlier depended on relabeling to accomplish sup- 
pression. 

The second method used by the younger sub- 
jects was that of shifting emphasis from the boy 
to one of the other stimuli in the picture and 
using the substitute as a new focus for the 
gestalt. This method was used by 9 of the second 
graders (60%), 7 of the fourth graders (35%), 
and 12 of the sixth graders (60%), as compared 
to 24% of the high school students and 20% of 
the graduate students. 

A developmental difference was suggested, 
however, by the fact that the technique of shift- 
ing emphasis may take two different forms, one 
seemingly more primitive than the other. The 
more crude device consists of simply naming or 
describing the other objects without organizing 
them into a story, thus actually failing to comply 
with the instructions. Most (78%) of the nine 
second graders relied on this cruder method, 
while none of the fourth graders and only two of 
the sixth graders did so. 

The shift of emphasis could also be accomp- 
lished by the slightly more refined method of 
using the objects in a personalized way (actually 
reminiscent of relabeling)—attributing human 
emotions and actions to them (“This ball was 
lonely and so he went over to talk to the ba- 
nana”). Only 2 of the second graders used this 
approach, but 7 of the fourth graders and 10 of 
the sixth graders relied upon it. 

No apparent contribution by intelligence to 
the operation of suppression was discovered: 
There was no relationship between IQ and 
method of suppression used by subjects in any 
of the three age groups, nor was IQ related to 
inability to perform the task at all. 


DISCUSSION


Within the limits of the specific experimental 
procedure employed, some clear developmental 
trends in the operation of suppression were re-
vealed. First, the near impossibility of testing
children in the first grade and younger and the 
fact that the second graders who failed the task 
were the younger subjects in their class strongly 
suggest that children must be approximately 8 
years of age before the basic intellectual tools 
required for suppression are available to them. 

From ages 8 through 12, however, a very lim- 
ited number of methods of eliminating a for- 
bidden stimulus become available. Perhaps the 
crudest of these consists of focusing on other 
objects in the picture and then simply describing
these, thereby failing to create an organized 
story. The youngest subjects depended rather 
heavily on this approach, which was used spar- 
ingly by all older subjects. 

Other methods used by the grade-school sub- 
jects, however, were only slightly less primitive, 
consisting either of a simple relabeling of the 
main stimulus figure or of the closely related 
technique of shifting the story’s emphasis to 
personified objects. If the basic function of sup- 
pression is to eliminate a thought, idea, or per- 
cept so that a new intellectual task may be un- 
dertaken, both these techniques are relative fail- 
ures, since both leave the original thought struc- 
ture relatively unaltered except for a new name 
or focus. 

Two other methods used by the older subjects 
of the previous study (Korner, 1966) were ap- 
parently not available to these younger subjects. 
The first, which is in use by age 16, seemed to 
represent the task of elimination in progress and 
consisted of stories which involved a boy at the 
beginning but arranged for him to leave before 
the end of the story. This method is somewhat 
less primitive in that it eventually accomplished 
the experimental task of suppression somewhat 
more successfully, although only at the expense 
of temporarily disregarding explicit instructions. 
The only method which truly and immediately 
eliminates the entire thought-gestalt did not 
appear before the graduate-study level and then 
was used only by a small percentage of the 
graduate subjects; it consisted of using the in- 
tellectual skills of abstraction and generalization 
to eliminate the whole gestalt as well as the spe-
cific forbidden stimulus.

One incidental observation in this series of 
studies is the frequent appearance of hostility, 
violence, and aggression in the stories of subjects 
at all age levels, from second grade through 
graduate school. No immediate explanation oc- 
curs for this phenomenon, though it seems in- 
adequately accounted for by the fact that the 
task is somewhat difficult and frustrating. 

Finally, it may be argued that the specific 
procedure employed necessitates at least a par- 
tial interdependence among the judged ability to 
suppress, general verbal abilities, and the more 
specific ability to make up a story. The finding of 
zero correlation between IQ and suppressive 
ability would seem to discount the importance of 
general verbal abilities, but it is true that the 
ability to make up a story, which remains rather 
constant during the grade-school years, shows 
considerable change from grade school to high 
school, principally in embellishment of character, 
setting, and plot. Despite these embellishments,
however, the fundamental operations used in
accomplishing the experimental task change very 
little from age to age. Moreover, with the excep- 
tion of a few graduate students, these operations 
remain surprisingly primitive and rather ineffec- 
tive. An intercultural study is currently under- 
way in an effort to illuminate more precisely 
the role of language in suppressive operations. 


COMPARISON OF WAIS M-F INDEX WITH TWO MEASURES
OF MASCULINITY-FEMININITY


DOROTHEA McCARTHY, FREDERICK M. SCHIRO, AND JOHN P. SUDIMACK


Fordham University 


Wechsler suggested an M-F index based on sex differences in performance on 6 
subtests of the WAIS which he claimed could probably be interpreted as 
comparable to the Terman-Miles Attitude-Interest Analysis Test and other 
masculinity-femininity measures. The present study is based on 80 college 
students, 40 males and 40 females, who took short forms of the WAIS, the 
Terman-Miles, and the Guilford-Martin inventories. Negligible correlations 
were obtained for both sexes between each of the masculinity-femininity per-
sonality measures and the Wechsler M-F index. Thus, sex differences in pat-
terning of intellectual abilities on the WAIS are not the same as traditional
measures of masculinity-femininity.


In his chapter on sex differences, Wechsler 
(1958) identified six subtests of the WAIS which 
yield sex differences, three in favor of males and 
three in favor of females. He suggested an M-F 
index computed by using the sum of the weighted 
scores for the masculine tests (Information, 
Arithmetic, and Picture Completion) minus the 
sum of the weighted scores of the feminine tests 
(Vocabulary, Similarities, and Digit Symbol). 
Resulting positive scores he considered indicative 
of a “masculine trend,” and negative scores were 
claimed to be indicative of a “feminine trend.” 
After showing that this measure yielded a sig- 
nificant difference between 300 males and 300 
females, Wechsler (1958) stated: “Thus, one 
can obtain an MF score on the WAIS compara- 
ble to MF scores on standard masculinity-femi- 
ninity tests like the Miles-Terman or the MMPI, 
with possible comparable interpretation [p. 149; 
italics added].” 
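
The index described above is a simple difference of two sums, which can be sketched as follows. The function name and the scores in the example are hypothetical; only the subtest groupings and the sign convention come from the text.

```python
MASCULINE_SUBTESTS = ("Information", "Arithmetic", "Picture Completion")
FEMININE_SUBTESTS = ("Vocabulary", "Similarities", "Digit Symbol")


def wais_mf_index(weighted_scores):
    """Sum of the 'masculine' weighted scores minus the sum of the 'feminine' ones.

    weighted_scores: dict mapping subtest name to its weighted (scaled) score.
    A positive result corresponds to what Wechsler called a "masculine trend",
    a negative result to a "feminine trend".
    """
    masculine = sum(weighted_scores[name] for name in MASCULINE_SUBTESTS)
    feminine = sum(weighted_scores[name] for name in FEMININE_SUBTESTS)
    return masculine - feminine


# Hypothetical scores for one subject, for illustration only.
example = {
    "Information": 12,
    "Arithmetic": 11,
    "Picture Completion": 10,
    "Vocabulary": 13,
    "Similarities": 12,
    "Digit Symbol": 11,
}
print(wais_mf_index(example))  # -3
```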

It occurred to the writers that although Wech-
sler had demonstrated sex differences in the pat-
terning of intellectual abilities as measured by 
the WAIS, the resulting index was probably not 
the same personality dimension of masculinity- 
femininity that has traditionally been employed 
in the psychological literature, and that to label 
such an index based on sex differences in the
patterning of intellectual abilities an M-F index,
in the same sense in which it is used in person- 
ality testing, would be erroneous and seriously 
misleading. 

In an effort to clarify this point, the present 
study was undertaken to make a comparison of 
Wechsler’s suggested M-F index with two well- 
known personality tests long recognized as valid 
and reliable measures of masculinity-femininity, 
namely, the Terman-Miles Attitude-Interest 
Analysis Test (Terman & Miles, 1936a, 1936b) 
and the M scale of the Guilford-Martin Inven- 
tory of Factors GAMIN (Guilford & Martin, 
1943; Martin, 1945). 

The subjects for the present study consisted 
of 80 college students attending a coeducational 
college in a middle-Atlantic state. There were 40 
men and 40 women between the ages of 17 and 
25 years who volunteered to participate in the 
testing program. All were born in the United 
States and had at least one parent also native 
born. Two examiners administered a short form 
of the WAIS consisting of the six subtests re-
quired to obtain the M-F index and one of the
personality tests to half the subjects. Each ex- 
amined 20 men and 20 women. The male sub- 
jects were, on the average, 1 year older than the 
females, but this difference was not statistically 
significant. The two examiners independently
scored samples of the Similarities and Vocabu- 
lary subtests of the WAIS, and no consistent 
bias toward strictness or leniency was found on 
these tests in which subjectivity of scoring might 
have entered. 

Instructions to the subjects were as follows: 


We are doing research on the relationship be- 
tween abilities and interests. The test that I am 
going to give you is a test of aptitude and abili- 
ties. Following this test you will be asked to fill 
out a form which will give us an idea of your 
interests. We are only interested in your scores
on a statistical basis and no permanent record 
of your scores will be kept. 


All subjects included were naive as to the 
purpose of the instruments involved. They were 
reassured that no scores would be made available 
to the authorities at the college they were at- 
tending. 

Pearson r correlation coefficients were com- 
puted for the males and for the females sepa- 
rately between the Wechsler M-F scores and the 
scores on each of the personality tests employed.

As can be seen from the correlations presented 
in Table 1, there is no relationship between the 
Wechsler M-F index and the M-F scores ob- 
tained with the Terman-Miles Attitude-Interest 
Analysis Test for either sex. Likewise, no rela- 
tionship was found for either sex between the 
Wechsler M-F scale and the Guilford-Martin M 
score, which is considered indicative of the per-
sonality trait of masculinity. It would appear,
then, that interpretation of the M-F index on
the WAIS should be limited to sex differences in
the patterning of intellectual abilities as mea-
sured by the WAIS. Care should be taken to
avoid any implication of sexual inversion of per-
sonality traits for either men or women at the
college level.


TABLE 1

CORRELATIONS OF WECHSLER M-F INDEX WITH THE
M-F SCORES ON TWO PERSONALITY TESTS

Sex         Terman-Miles    Guilford-Martin
Males          +.10             -.14
Females        -.07             +.02


BRIEF REPORTS


ABBREVIATED FORM OF THE WISC: 
A REEVALUATION 1 


ROBERT V. ERIKSON 
University of North Dakota 


An innovation in developing an abbreviated 
form of the WISC was that done by Yudin 
(1966). Employing selective items on 9 of the 
11 subtests while administering the complete 
Digit Span and Coding subtests, an abbreviated 
form was developed which would both provide 
a saving in time and lend itself to the type 
of scatter analysis that is possible with the full 
WISC. 

Records of 100 subjects who had previously 
been given the full WISC were selected with 
no regard to age, sex, or IQ from the files of 
the North Dakota State Department of Health. 
The Full Scale IQs of the 34 females and 66 
males ranged 49-129 with a mean of 89.07 and 
a standard deviation of 17.01. The ages ranged
from 6 years, 9 months to 15 years, 9 months
with a mean of 11 years, 0 months. All 100
WISCs were rescored using the abbreviated form 
according to the arrangements set forth by Yudin 
(1966, p. 273). 

The correlations, on the present sample, be- 
tween the original and abbreviated forms of the 
subtests ranged from .80 on Picture Completion 
to .94 on Object Assembly. The following IQ
correlations were obtained: Verbal IQ, .96; 
Performance IQ, .93; and Full Scale IQ, .98. 
With the exceptions of Similarities at the .01
level, Object Assembly at the .05 level, and
Picture Arrangement, which was not significant,
all subtest and scale differences were significant,
using t tests, beyond the .001 level. In every
case, except for one tie, the abbreviated form
gave a more conservative IQ score. Overall, for
the original scoring of the WISC protocol, the
following mean IQ scores were obtained: Verbal
IQ of 87.89, Performance IQ of 92.46, and Full
Scale IQ of 89.07, as compared to the respective
IQ scores obtained when using the abbreviated
form: 80.84, 81.92, and 81.29.


1 An extended report of this study may be ob-
tained without charge from Robert V. Erikson,
Youth Development Center, Loysville, Pennsylvania,
or for a fee from the American Documentation
Institute. Order Document No. 9652 from ADI
Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington, D. C.
20540. Remit in advance $1.25 for microfilm or $1.25
for photocopies, and make checks payable to: Chief,
Photoduplication Service, Library of Congress.

Full Scale IQs grouped by age, that is, five
groups covering 2-year intervals (6 years, 0
months to 7 years, 11 months; 8.0-9.11; 10.0-
11.11; 12.0-13.11; and 14.0-15.11), and Full
Scale IQs grouped by IQ, that is, four groups
covering 20-point intervals (49-69; 70-89;
90-109; and 110-129), were compared. Age group
correlations were all above .95, while IQ group
correlations ranged from .81 for the 90-109
group to .96 for the 110-129 group. For all age
and IQ groups, the differences between the
original and abbreviated forms were significant,
using t tests, beyond the .001 level.

While the obtained correlations are very
similar to those of Yudin (1966), the over-
whelming number of highly significant t’s sug-
gests that the abbreviated form of the
WISC, as put forward by Yudin, does not stand
up under cross-validation. A lack of an adequate
item pool seems to be the biggest contributing
factor. In conclusion, it is felt that the WISC
is a valuable clinical tool, and the saving in
time is not worth this lost clinical information.


REFERENCE 


Yudin, W. An abbreviated form of the WISC for
use with emotionally disturbed children. Journal 
of Consulting Psychology, 1966, 30, 272-275. 


(Received April 28, 1967) 


Journal of Consulting Psychology 
1967, Vol. 31, No. 6, 642 


FREQUENCY AND CONTENT OF TEST ITEMS NORMALLY 
OMITTED FROM MMPI SCALES 1


MELVIN A. GRAVITZ 


American University 


There are many reasons why individual test 
items are left unanswered on the MMPI, includ- 
ing the possibility that certain words or phrases 
may have an ambiguous or idiosyncratic mean-
ing, poor motivation, deliberate resistance, lack
of energy, or emotional disturbance. Unanswered 
test items are important, however, because they 
are automatically removed from the scoring of 
the several validity and clinical scales, and the 
effect of such omissions is to reduce the length 
of the test, shrink the variance, and attenuate 
the profile. Despite these important consequences
and the wide use of the MMPI in assessment and
research, few data are available concerning
the normal frequency and content
of “Cannot Say” responses. 

In a related study with college males Mosher 
(1966) concluded that frequently omitted items 
were more ambiguous, less relevant to the 
evaluation of psychopathology, and less personal 
and private than were rare omissions. 

It was the purpose of the present investigation 
to obtain from a large normal adult sample the 
test items most frequently omitted from the 
3 validity and 10 clinical scales. The booklet 
form of the MMPI was routinely administered 
as part of preemployment screening to 7,149 
males and 4,816 females who were applicants 
for a wide variety of vocational positions and 
who were considered to be normal on the basis 
of no emotional difficulty evident upon pretest 
interview. The age span was 17-60, and educational
level ranged from less than a high school
diploma to the doctoral degree. The 15 test
items most frequently omitted from each scale
were obtained, together with the number of male
and female Ss who left that item blank. These
frequencies ranged from 18.4% to less than 1%
of the total sample. The most frequent omissions
(i.e., left blank by 5% or more) fell into only
six content areas. It was further found that men
most often omitted, in descending order, 23
different items dealing with personal attitudes
and interests, religion, sex, fears, and politics
and law and order. Women most commonly
omitted 16 different items concerned with personal
attitudes and interests, sex, family, religion,
politics and law and order, and fears, in that
order. Nine items overlapped, in that they were
left blank by both sexes.

¹ An extended report of this study may be obtained
without charge from Melvin A. Gravitz, 8113 Cindy
Lane, Bethesda, Maryland 20034, or for a fee from
the American Documentation Institute. Order Document
No. 9653 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of Congress,
Washington, D. C. 20540. Remit in advance
$1.75 for microfilm or $2.50 for photocopies, and
make checks payable to: Chief, Photoduplication
Service, Library of Congress.

Unlike Mosher’s (1966) finding, that fre- 
quently omitted items were described as less 
personal than those which were rarely left 
blank, the present study indicates the reverse, 
which is consistent with the logic that one would 
be more reluctant and not less reluctant to 
respond openly to topics of personalized inquiry. 
Indeed, it is noteworthy that so few of the 
“privacy” MMPI items were omitted by large 
numbers of Ss, but perhaps our society is be- 
coming conditioned and inured to the taking 
of psychological tests. 


It was concluded that the “Cannot Say” items
are a reflection of resistance on the part of S; but
rather than being a manifestation of generalized
test-taking resistance, MMPI omissions more specifically
appear to represent a disinclination to
respond openly to test content which patently
probes personal and private feelings.